Train timetabling with the general learning environment and multi-agent deep reinforcement learning
Published in: | Transportation research. Part B: methodological 2022-03, Vol.157, p.230-251 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | • A novel multi-agent deep reinforcement learning method for solving the train timetabling problem. • A general environment that captures the system dynamics of the single-track and double-track railway systems. • A multi-agent actor-critic algorithm framework of centralized training and decentralized execution. • A case study demonstrating advantages of the proposed method over traditional counterparts.
This paper proposes a multi-agent deep reinforcement learning approach for the train timetabling problem of different railway systems. A general train timetabling learning environment is constructed to model the problem as a Markov decision process, in which the objectives and complex constraints of the problem can be distributed naturally and elegantly. Through subtle changes, the environment can be flexibly switched between the widely used double-track railway system and the more complex single-track railway system. To address the curse of dimensionality, a multi-agent actor–critic algorithm framework is proposed to decompose the large combinatorial decision space into multiple independent ones, which are parameterized by deep neural networks. The proposed approach was tested using a real-world instance and several test instances. Experimental results show that the proposed method obtains cooperative policies for the single-track train timetabling problem within a reasonable computing time and that these policies outperform several prevailing methods in terms of solution optimality. The proposed method can also be easily generalized to the double-track train timetabling problem by changing the environment slightly. |
---|---|
ISSN: | 0191-2615 1879-2367 |
DOI: | 10.1016/j.trb.2022.02.006 |