Routing and Scheduling of Mobile Energy Storage System for Electricity Arbitrage Based on Two-Layer Deep Reinforcement Learning
Published in: | IEEE transactions on transportation electrification 2023-03, Vol.9 (1), p.1087-1102 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | The mobile energy storage system (MESS) plays an increasingly important role in energy systems because of its spatial and temporal flexibility, but its high upfront investment cost calls for dedicated operation and arbitrage strategies. In the existing literature, MESS arbitrage problems are usually cast as mixed-integer programming models. However, the performance of such model-based methods deteriorates under the uncertainties of power and transportation networks and the complicated operational characteristics of batteries. To overcome these deficiencies, this article proposes a data-driven, uncertainty-adaptive MESS arbitrage method that considers MESS mobility rules, battery degradation, and operational efficiencies. A two-layer deep reinforcement learning (DRL) method is developed to obtain the discrete mobility decisions and the continuous charging or discharging power, and a sequential training strategy is designed to accelerate the convergence of model training. The proposed method is tested using real-world electricity prices and traffic information of charging stations. Compared with traditional model-based methods that rely on complete and accurate future information, the proposed DRL method obtains high arbitrage profits by learning arbitrage strategies from historical data and making effective decisions with limited real-time information. |
---|---|
ISSN: | 2332-7782 2577-4212 |
DOI: | 10.1109/TTE.2022.3201164 |