Proximal policy optimization through a deep reinforcement learning framework for remedial action schemes of VSC-HVDC
Published in: | International journal of electrical power & energy systems 2023-08, Vol.150, p.109117, Article 109117 |
---|---|
Format: | Article |
Language: | English |
Summary: | • VSC-HVDC is used for the first time in a goal-oriented control scheme with a deep reinforcement learning model, thereby achieving better overall voltage regulation. • To improve training stability, proximal policy optimization, which adopts the trust-region concept, is used as the learning algorithm. • To configure the reward functions, two parallel optimal power flow calculations are run iteratively in the learning framework. • Advanced remedial action schemes allow flexibility and redundancy even under unforeseen changes in renewable power output.
A back-to-back VSC-HVDC emergency control strategy based on proximal policy optimization (PPO) and a multi-agent deep reinforcement learning (DRL) approach is proposed for use in an energy management system (EMS). In this scheme, an advanced DRL algorithm is proposed that combines PPO with a communication neural network for large power systems. The PPO agents, modeled as intelligent agents with objective functions, show better convergence performance than existing DRL algorithms. Further, the model is shown to effectively address voltage variations caused by the high penetration of renewable energy sources. By implementing PPO, the learning procedure is stabilized and made robust to continuous changes in network topology. To demonstrate the effectiveness of the proposed algorithm, comprehensive case studies were conducted on standard test systems and the Korean power system, considering variations in load and PV generation and a weak centralized communication environment. The results indicate outstanding control performance, with bus voltages and line flows regulated autonomously, thereby validating the effectiveness of the method. |
---|---|
ISSN: | 0142-0615 1879-3517 |
DOI: | 10.1016/j.ijepes.2023.109117 |
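
The summary notes that PPO adopts the trust-region concept to stabilize training. As a rough illustration only, the sketch below computes the standard PPO clipped surrogate objective for a batch of samples. It is not the paper's implementation: the function name `ppo_clipped_objective`, its arguments, and the clipping value `clip_eps=0.2` are assumptions chosen for illustration, and the article's multi-agent setup, communication network, and OPF-based reward are not reproduced.

```python
import math


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    updated and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same samples.
    clip_eps: clipping width; 0.2 is an assumed value, not from the article.
    """
    total = 0.0
    for nlp, olp, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the updated and the old policy.
        ratio = math.exp(nlp - olp)
        # Keep the ratio inside [1 - eps, 1 + eps]: the trust-region-like step.
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # One sample whose probability doubled and whose advantage is positive:
    # the gain is capped by the clip instead of growing with the ratio.
    print(ppo_clipped_objective([math.log(2.0)], [0.0], [1.0]))
```

The clip removes the incentive to move the policy far from the one that collected the data when doing so would raise the objective, while a large ratio on a negative-advantage sample is still penalized in full.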