An Online Reinforcement Learning-based Energy Management Strategy for Microgrids with Centralized Control

Bibliographic Details
Published in: IEEE Transactions on Industry Applications, 2024-07, pp. 1-10
Main Authors: Meng, Qinglin; Hussain, Sheharyar; Luo, Fengzhang; Wang, Zhongguan; Jin, Xiaolong
Format: Article
Language:English
Description
Summary: To address the significant unpredictability and intermittency of renewable energy sources, particularly wind and solar power, this paper introduces a novel optimization model based on online reinforcement learning. Initially, an energy management optimization model is designed to achieve plan adherence and minimize energy storage (ES) operation costs, taking into account the inherent challenges of wind power-photovoltaic energy storage systems (WPESS). An online reinforcement learning framework is employed, which defines the state variables, action variables, and reward functions for the energy management optimization model. The state-action-reward-state-action (SARSA) algorithm is applied to learn the joint scheduling strategy for the microgrid system, utilizing its iterative exploration mechanism and interaction with the environment. This strategy aims to achieve effective power tracking and to reduce storage charging and discharging. The proposed method's effectiveness is validated on a residential community with electric vehicle (EV) charging loads as a test case. Numerical analyses demonstrate that the approach does not rely on traditional mathematical models, adapts well to the uncertainties and complex constraints of the WPESS, maintains low costs, and achieves computational efficiency significantly higher than that of the model predictive control (MPC) and deep Q-network (DQN) algorithms.
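The summary describes a tabular SARSA loop over state variables, action variables, and a reward function for plan tracking with storage-wear penalties. As an illustrative sketch only, not the paper's implementation: the toy environment, dispatch-plan values, and reward weights below are hypothetical stand-ins for the WPESS model, but the on-policy SARSA update itself is standard.

```python
import random

# Hypothetical toy setting: a small battery tries to track a fixed dispatch plan.
# State: (hour, state-of-charge level); actions: -1 discharge, 0 idle, +1 charge.
PLAN = [1, -1, 0, 1, -1, 0]   # target net exchange per hour (made-up numbers)
ACTIONS = [-1, 0, 1]
SOC_MAX = 3

def step(hour, soc, action):
    """Apply an action; reward penalizes plan deviation and charge/discharge wear."""
    soc_next = min(max(soc + action, 0), SOC_MAX)
    reward = -abs(action - PLAN[hour]) - 0.1 * abs(action)
    return soc_next, reward

def epsilon_greedy(Q, s, eps, rng):
    """Explore with probability eps, otherwise pick the greedy action."""
    if rng.random() < eps:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q.get((s, a), 0.0))

def train(episodes=2000, alpha=0.2, gamma=0.95, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = {}
    for _ in range(episodes):
        soc = 1
        s = (0, soc)
        a = epsilon_greedy(Q, s, eps, rng)
        for hour in range(len(PLAN)):
            soc_next, r = step(hour, soc, a)
            s_next = (hour + 1, soc_next)
            a_next = epsilon_greedy(Q, s_next, eps, rng)
            # SARSA: on-policy TD update using the action actually taken next,
            # Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a))
            Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (
                r + gamma * Q.get((s_next, a_next), 0.0) - Q.get((s, a), 0.0)
            )
            soc, s, a = soc_next, s_next, a_next
    return Q

Q = train()
# After training, the greedy policy at hour 0 follows the plan's first entry.
best_first_action = max(ACTIONS, key=lambda a: Q.get(((0, 1), a), 0.0))
```

The on-policy character of SARSA (bootstrapping from the action actually selected next, exploration included) is what the summary's "iterative exploration mechanism and interaction with the environment" refers to; replacing `a_next` with the greedy maximum would instead give Q-learning.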
ISSN: 0093-9994; 1939-9367
DOI: 10.1109/TIA.2024.3430264