Solving dynamic distribution network reconfiguration using deep reinforcement learning

Bibliographic Details
Published in: Electrical Engineering, 2022, Vol. 104 (3), pp. 1487-1501
Main Authors: Kundačina, Ognjen B., Vidović, Predrag M., Petković, Milan R.
Format: Article
Language:English
Description
Summary: Distribution network reconfiguration, as part of the distribution management system, plays an important role in increasing the energy efficiency of the distribution network by coordinating the operation of its switches. Dynamic distribution network reconfiguration (DDNR), enabled by a sufficient number of remote switching devices in the distribution network, attempts to find the optimal topologies of the distribution network over a specified time interval. This paper proposes a data-driven DDNR approach based on deep reinforcement learning (DRL). The DRL-based DDNR controller aims to minimize the objective function, i.e. the active energy losses and the cost of switching manipulations, while satisfying the constraints. The following constraints are considered: allowed bus voltages, allowed line apparent powers, a radial network configuration with all buses supplied, and the maximum allowed number of switching operations. This optimization problem is modelled as a Markov decision process by defining the possible states and actions of the DDNR agent (controller) and the rewards that lead the agent to minimize the objective function while satisfying the constraints. Switching operation constraints are modelled by modifying the action space definition rather than by adding a penalty term to the reward function, which increases computational efficiency. The proposed algorithm was tested on three examples: a small benchmark network, a real-life large-scale test system, and the IEEE 33-bus radial system, and the results confirmed the robustness and scalability of the proposed algorithm.
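To make the Markov decision process formulation described in the abstract more concrete, the following minimal Python sketch illustrates one plausible way to structure such a DDNR environment. It is not the authors' implementation; the class name, the placeholder power-flow results, the cost coefficients, and the two-operation switching distance are all hypothetical assumptions made purely for illustration. It only mirrors the ideas stated in the abstract: the reward combines energy-loss cost and switching cost, voltage and line-flow violations are penalized in the reward, and the switching-operation budget is enforced by restricting the action space rather than by an extra penalty term.

```python
import random

# Hypothetical sketch of a DDNR environment cast as an MDP (illustration only,
# not the paper's code). Each action selects one candidate radial topology for
# the current time step.

class DDNREnvSketch:
    def __init__(self, num_configs, horizon, max_switchings,
                 loss_price=1.0, switch_cost=0.5, penalty=100.0):
        self.num_configs = num_configs        # candidate radial topologies
        self.horizon = horizon                # number of time steps (e.g. hours)
        self.max_switchings = max_switchings  # total switching-operation budget
        self.loss_price = loss_price          # cost per kWh of energy losses (assumed)
        self.switch_cost = switch_cost        # cost per switching operation (assumed)
        self.penalty = penalty                # penalty for constraint violations (assumed)
        self.reset()

    def reset(self):
        self.t = 0
        self.config = 0                       # index of the current topology
        self.switchings_left = self.max_switchings
        return self._state()

    def _state(self):
        # State: time step, current topology, remaining switching budget.
        # A real environment would also include forecast loads and generation.
        return (self.t, self.config, self.switchings_left)

    def _num_switchings(self, a, b):
        # Placeholder: switching operations needed to move from topology a to b.
        return 0 if a == b else 2

    def allowed_actions(self):
        # Switching constraint handled through the action space: topologies that
        # would exceed the remaining budget are simply not offered to the agent.
        return [i for i in range(self.num_configs)
                if self._num_switchings(self.config, i) <= self.switchings_left]

    def step(self, action):
        n_sw = self._num_switchings(self.config, action)
        self.switchings_left -= n_sw
        self.config = action

        # Placeholder power-flow results; a real environment would run a power
        # flow on the selected topology to obtain losses, voltages, and flows.
        losses_kwh = random.uniform(10.0, 50.0)
        voltage_ok = random.random() > 0.05
        flow_ok = random.random() > 0.05

        reward = -(self.loss_price * losses_kwh + self.switch_cost * n_sw)
        if not voltage_ok:
            reward -= self.penalty            # bus-voltage limit violated
        if not flow_ok:
            reward -= self.penalty            # line apparent-power limit violated

        self.t += 1
        done = self.t >= self.horizon
        return self._state(), reward, done
```

With a discrete action space of this form, any standard value-based DRL agent (for example a DQN-style learner) could be trained by repeatedly calling reset(), querying allowed_actions(), and applying step(); the specific agent architecture and training details used in the paper are not reproduced here.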
ISSN: 0948-7921, 1432-0487
DOI: 10.1007/s00202-021-01399-y