
Dynamic holding control to avoid bus bunching: A multi-agent deep reinforcement learning framework

Bibliographic Details
Published in: Transportation Research Part C: Emerging Technologies, 2020-07, Vol. 116, Article 102661
Main Authors: Wang, Jiawei; Sun, Lijun
Format: Article
Language: English
Description
Summary:

Highlights:
• A multi-agent deep reinforcement learning framework is proposed for bus holding control.
• A reward function is defined to achieve headway self-equalization.
• The actions of all agents are accounted for by introducing a joint-action tracker.
• A training scheme based on proximal policy optimization is designed to train the agents.
• The framework outperforms baseline methods in simulation studies.

Abstract:
Bus bunching has been a long-standing problem that undermines the efficiency and reliability of public transport services. The most popular countermeasure in practice is to introduce static or dynamic holding control. However, most previous holding control strategies rely on local information and a pre-specified headway/schedule, overlooking the global coordination of the whole bus fleet and its long-term effects. To efficiently incorporate global coordination and long-term operation into bus holding, in this paper we propose a multi-agent deep reinforcement learning (MDRL) framework to develop dynamic and flexible holding control strategies for a bus route. Specifically, we model each bus as an agent that interacts not only with its leader/follower but also with all other vehicles in the fleet. To better explore potential strategies, we develop an effective headway-based reward function within the proposed framework. In the learning framework, we model fleet coordination using a basic actor-critic scheme together with a joint-action tracker that better characterizes the complex interactions among agents during policy learning, and we apply proximal policy optimization to improve learning performance. We conduct extensive numerical experiments to evaluate the proposed MDRL framework against multiple baseline models that rely only on local information. Our results demonstrate the superiority of the proposed framework and show the promise of applying MDRL to the coordinated control of public transport vehicle fleets in real-world operations.
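
The abstract specifies a headway-based reward that drives headway self-equalization but does not give its exact form. Below is a minimal Python sketch of one plausible such reward, in which each bus (agent) is penalized for the imbalance between its forward and backward headways; the function names, the normalization, and the treatment of the first/last bus are all illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the paper's code): a headway-equalization reward
# of the kind described in the abstract. Names and the exact functional
# form are illustrative assumptions.
import numpy as np


def headway_reward(forward_headway: float, backward_headway: float) -> float:
    """Hypothetical per-agent reward: 0 when the bus is equally spaced
    between its leader and follower, increasingly negative as the two
    headways diverge, encouraging headway self-equalization."""
    total = forward_headway + backward_headway
    if total <= 0.0:  # degenerate case: fully bunched neighbors
        return -1.0
    # Normalized absolute headway imbalance, so the reward lies in [-1, 0].
    return -abs(forward_headway - backward_headway) / total


def fleet_rewards(arrival_times: np.ndarray) -> np.ndarray:
    """Per-agent rewards given arrival times (seconds) of consecutive
    buses at a control stop, ordered from leader to last follower."""
    headways = np.diff(arrival_times)  # headways[i]: gap between bus i and bus i+1
    rewards = np.zeros(len(arrival_times))
    # Interior buses have both a forward and a backward headway; on a
    # loop route every bus would, so the zero endpoints are a simplification.
    for i in range(1, len(arrival_times) - 1):
        rewards[i] = headway_reward(headways[i - 1], headways[i])
    return rewards


# Example: bus 1 is bunched toward bus 0, so its reward is strongly negative.
times = np.array([0.0, 120.0, 600.0, 1200.0])
print(fleet_rewards(times))  # -> [ 0.   -0.6  -0.111  0.  ]
```

Under such a reward, a holding action that delays a bunched bus lengthens its forward headway and shortens its follower's, pulling both agents' rewards toward zero; this is the equalizing pressure the framework's agents are trained to exploit.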
ISSN: 0968-090X, 1879-2359
DOI: 10.1016/j.trc.2020.102661