An Online Reinforcement Learning-based Energy Management Strategy for Microgrids with Centralized Control
Published in: IEEE Transactions on Industry Applications, 2024-07, pp. 1-10
Main Authors:
Format: Article
Language: English
Subjects:
Summary: To address the significant unpredictability and intermittency of renewable energy sources, particularly wind and solar power, this paper introduces a novel optimization model based on online reinforcement learning. First, an energy management optimization model is designed to achieve plan adherence and minimize energy storage (ES) operation costs, accounting for the inherent challenges of wind power-photovoltaic energy storage systems (WPESS). An online reinforcement learning framework is then employed, defining the state variables, action variables, and reward functions for the energy management optimization model. The state-action-reward-state-action (SARSA) algorithm is applied to learn a joint scheduling strategy for the microgrid system through iterative exploration and interaction with the environment, with the goals of effective power tracking and reduced storage charging and discharging. The method's effectiveness is validated on a residential community with electric vehicle (EV) charging loads as a test case. Numerical analyses demonstrate that the approach does not rely on traditional mathematical models, adapts well to the uncertainties and complex constraints of the WPESS, and keeps costs low while achieving computational efficiency significantly higher than that of the model predictive control (MPC) and deep Q-network (DQN) algorithms.
ISSN: 0093-9994, 1939-9367
DOI: 10.1109/TIA.2024.3430264
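
For orientation, the summary above centers on tabular, on-policy SARSA. The sketch below is a minimal illustration of that update rule, not the paper's implementation: the paper's actual state variables, action variables, and reward function for the WPESS are defined in the full text, and the state/action discretization, hyperparameters, and toy plan-tracking environment here are all assumptions made for the example.

```python
"""Minimal tabular SARSA sketch. Hypothetical throughout: the toy
environment and all constants are illustrative assumptions, not the
paper's WPESS model."""
import numpy as np

rng = np.random.default_rng(0)

N_STATES, N_ACTIONS = 50, 5          # assumed discretization
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1

Q = np.zeros((N_STATES, N_ACTIONS))  # action-value table

def toy_env_step(state, action):
    """Stand-in environment: reward penalizes deviation of the action
    from a hypothetical dispatch plan; transitions are random."""
    plan = state % N_ACTIONS                 # hypothetical plan signal
    reward = -abs(action - plan)             # track the plan
    next_state = int(rng.integers(N_STATES))
    return next_state, reward

def epsilon_greedy(state):
    # Explore with probability EPSILON, otherwise exploit current Q.
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(Q[state]))

for episode in range(200):
    state = int(rng.integers(N_STATES))
    action = epsilon_greedy(state)
    for _ in range(96):                      # e.g. 15-min slots per day (assumed)
        next_state, reward = toy_env_step(state, action)
        next_action = epsilon_greedy(next_state)
        # On-policy SARSA update: bootstrap on the action actually
        # taken next, unlike Q-learning's max over actions.
        Q[state, action] += ALPHA * (
            reward + GAMMA * Q[next_state, next_action] - Q[state, action]
        )
        state, action = next_state, next_action
```

Because the bootstrapped target uses the action the behavior policy actually selects, SARSA learns the value of the exploring policy itself, which is the "iterative exploration and interaction with the environment" mechanism the summary refers to.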