Actor-critic learning for optimal building energy management with phase change materials

Bibliographic Details
Published in:	Electric power systems research 2020-11, Vol.188, p.106543, Article 106543
Main Authors:	Rahimpour, Zahra; Verbič, Gregor; Chapman, Archie C.
Format:	Article
Language:	English
Subjects:	Actor-critic; Algorithms; Approximate dynamic programming; Building design; Buildings; Construction materials; Deep deterministic policy gradient; Dynamic programming; Energy efficiency; Energy management; Home energy management; Learning; Phase change materials; Phase transitions; Thermal analysis; Thermal energy

Description
Summary:	Highlights:
• Phase change materials (PCM) improve the thermal performance of buildings.
• Energy management is computationally challenging due to PCM’s nonlinearity.
• A computationally efficient method is necessary for plug-and-play implementation.
• Actor-critic reinforcement learning based on deep deterministic policy gradient.
• Benchmark against dynamic programming with access to building’s thermal model.

Abstract: Energy management in buildings using phase change materials (PCM) to improve thermal performance is challenging due to the nonlinear thermal capacity of the PCM. To address this problem, this paper adopts a model-free actor-critic on-policy reinforcement learning method based on deep deterministic policy gradient (DDPG). The proposed approach overcomes the major weakness of model-based approaches, such as approximate dynamic programming (ADP), which require an explicit thermal model of the building under control. This requirement makes a plug-and-play implementation of the energy management algorithm in an existing smart meter difficult due to the wide variety of building design and construction types. To overcome this difficulty, we use a DDPG algorithm that can learn policies in continuous action spaces without access to the full dynamics of the building. We demonstrate the competitive performance of DDPG by benchmarking it against an ADP-based approach with access to the full thermal dynamics of the building.
ISSN:	0378-7796 1873-2046
DOI:	10.1016/j.epsr.2020.106543
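
Illustrative note: the summary describes DDPG as an actor-critic method that learns policies over continuous actions without a thermal model. The sketch below is not taken from the article; it only illustrates three generic building blocks usually associated with DDPG: the critic's bootstrapped target, the soft (Polyak) update of target-network parameters, and exploration noise added to a deterministic action. All function names, the hyperparameter values, and the reading of the action as a normalized heating/cooling power are assumptions made for illustration, and plain Gaussian noise stands in for whatever noise process the authors may have used.

```python
import random


def td_target(reward, next_q, done, gamma=0.99):
    """Critic regression target r + gamma * Q'(s', mu'(s')).

    next_q is assumed to come from the target critic evaluated at the
    target actor's action; no bootstrapping at the end of an episode.
    """
    return reward + (0.0 if done else gamma * next_q)


def soft_update(target, online, tau=0.005):
    """Move target parameters a small step tau toward the online ones."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target, online)]


def explore(action, low, high, sigma=0.1, rng=random):
    """Add Gaussian noise to a deterministic action, then clip to bounds.

    The action is assumed here to be a normalized power setpoint.
    """
    noisy = action + rng.gauss(0.0, sigma)
    return max(low, min(high, noisy))


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the helpers.
    print(td_target(reward=-1.2, next_q=-10.0, done=False))
    print(soft_update([0.0, 1.0], [1.0, 1.0], tau=0.5))
    print(explore(0.95, low=0.0, high=1.0, rng=random.Random(0)))
```

In a full agent these helpers would sit around neural-network actor and critic models trained from stored transitions; they are kept as plain functions here so the arithmetic can be checked by hand.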