Offline reinforcement learning control for electricity and heat coordination in a supercritical CHP unit
Published in: Energy (Oxford), 2023-03, Vol. 266, p. 126485, Article 126485
Main Authors:
Format: Article
Language: English
Summary: With a high proportion of renewable energy generation connected to the power grid, combined heat and power (CHP) units need flexible operation and control capabilities over a wide range of variable load conditions. In China, the share of supercritical combined heat and power (S-CHP) units is gradually increasing owing to their high thermal efficiency. In this paper, we propose a data-driven environment modeling method and an offline reinforcement learning-based electricity–heat coordinated control approach to give the S-CHP unit wide and flexible load adjustment capability. First, a modeling method based on an ensemble of multiple multilayer perceptrons (MLPs) is proposed to address the over-fitting that a single MLP may produce. Then, the state and reward are designed in view of the dynamic characteristics of the S-CHP unit. Moreover, policy training is implemented through a soft actor-critic algorithm with a maximum-entropy formulation to ensure more robust exploration under variable load conditions and targets. The simulation results show that the generalization ability of the environment model in the multiple-MLP ensemble mode is stronger than that in the single-MLP mode. In addition, when the electric load command is between 267 MW and 325 MW, offline reinforcement learning significantly reduces the integral of absolute error (IAE) metrics of the output parameters, demonstrating that the proposed strategy achieves electricity–heat coordinated control under variable load conditions.
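For illustration only (none of the following code is from the paper): a minimal sketch of the kind of multiple-MLP ensemble environment model the abstract describes, in which several independently trained MLP regressors are averaged to curb the over-fitting of a single MLP. The class name, feature dimensions, and hyperparameters here are hypothetical assumptions.

```python
# Illustrative sketch, not the authors' implementation: an ensemble of
# MLP regressors used as a data-driven environment model that maps
# (state, action) pairs to the next plant state.
import numpy as np
from sklearn.neural_network import MLPRegressor

class EnsembleEnvModel:
    """Averages several bootstrap-trained MLPs to reduce over-fitting."""

    def __init__(self, n_members=5, hidden=(64, 64), seed=0):
        self.members = [
            MLPRegressor(hidden_layer_sizes=hidden, max_iter=500,
                         random_state=seed + i)
            for i in range(n_members)
        ]

    def fit(self, X, Y):
        rng = np.random.default_rng(0)
        n = len(X)
        for mlp in self.members:
            idx = rng.integers(0, n, size=n)  # bootstrap resample per member
            mlp.fit(X[idx], Y[idx])
        return self

    def predict(self, X):
        preds = np.stack([mlp.predict(X) for mlp in self.members])
        return preds.mean(axis=0)  # ensemble mean as the model output

# Toy usage with made-up dimensions: 8 state features + 3 actuator
# commands as inputs, 8 next-state features as outputs.
X = np.random.randn(1000, 11)
Y = np.random.randn(1000, 8)
model = EnsembleEnvModel().fit(X, Y)
print(model.predict(X[:2]).shape)  # (2, 8)
```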
Highlights:
• Offline RL ensures the electricity–heat coordinated control in variable loads.
• Ensemble MLP structure implements the model with better generalization ability.
• The state is selected considering the dynamic response of the S-CHP unit.
• The reward is designed in power-dominated and heat-dominated modes.
• The IAE metrics of system outputs are improved by the proposed policy.
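As background on two quantities named above: soft actor-critic optimizes the standard maximum-entropy reinforcement learning objective, and IAE is the time integral of the absolute tracking error. The formulas below are the textbook definitions; the paper's exact reward shaping, temperature setting, and error signals are not given in this record.

```latex
% Maximum-entropy objective optimized by soft actor-critic: the expected
% return is augmented by the policy entropy, weighted by a temperature alpha.
J(\pi) = \sum_{t} \mathbb{E}_{(s_t,\, a_t) \sim \rho_\pi}
  \Big[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\,\cdot \mid s_t)\big) \Big]

% Integral of absolute error (IAE) for one output y tracking a setpoint y_ref:
\mathrm{IAE} = \int_{0}^{T} \big| y_{\mathrm{ref}}(t) - y(t) \big| \, dt
```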
ISSN: 0360-5442
DOI: 10.1016/j.energy.2022.126485