
Decentralized learning for traffic signal control

Bibliographic Details
Main Authors: Prabuchandran, K. J., Hemanth Kumar, A. N., Bhatnagar, Shalabh
Format: Conference Proceeding
Language: English
Description
Summary: In this paper, we study the problem of finding the optimal order of the phase sequence [14] at the junctions of a road network so that traffic flow is managed efficiently. We model this problem as a Markov decision process (MDP). Solving it for all junctions of the road network simultaneously is hard, so we propose a decentralized multi-agent reinforcement learning (MARL) algorithm that treats each junction as a separate agent (controller). Each agent optimizes the order of its phase sequence using Q-learning with either ε-greedy or UCB [3] based exploration. Coordination between junctions is achieved through a cost feedback signal that each agent receives from its neighbouring junctions and uses to update its Q-factors. Simulations in VISSIM over two real road networks show that our algorithms perform significantly better than standard fixed signal timing (FST), saturation balancing (SAT) [14], and round-robin multi-agent reinforcement learning algorithms [11].
ISSN: 2155-2487, 2155-2509
DOI:10.1109/COMSNETS.2015.7098712
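
The per-junction learner described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `JunctionAgent`, the state encoding, the hyperparameters, and in particular the way neighbour costs are combined (own cost plus the mean of the neighbours' costs) are assumptions made for illustration, since the summary does not give the exact update rule. Costs are minimized, so the greedy action is the one with the lowest Q-factor and the UCB bonus is subtracted rather than added.

```python
import math
import random
from collections import defaultdict


class JunctionAgent:
    """Hypothetical Q-learning agent for one junction (cost-minimizing)."""

    def __init__(self, n_phases, alpha=0.1, gamma=0.9, epsilon=0.1,
                 ucb_c=1.0, seed=0):
        self.n_phases = n_phases      # number of candidate phases (actions)
        self.alpha = alpha            # learning rate (assumed value)
        self.gamma = gamma            # discount factor (assumed value)
        self.epsilon = epsilon        # exploration probability for epsilon-greedy
        self.ucb_c = ucb_c            # weight of the UCB exploration bonus
        self.rng = random.Random(seed)
        # Q-factors and visit counts, indexed by any hashable state.
        self.q = defaultdict(lambda: [0.0] * n_phases)
        self.counts = defaultdict(lambda: [0] * n_phases)

    def greedy(self, state):
        # Lowest estimated cost wins; ties go to the lowest phase index.
        q = self.q[state]
        return min(range(self.n_phases), key=lambda a: q[a])

    def select_epsilon_greedy(self, state):
        # With probability epsilon pick a random phase, otherwise the greedy one.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_phases)
        return self.greedy(state)

    def select_ucb(self, state):
        counts = self.counts[state]
        # Try every phase once before applying the confidence bound.
        for a in range(self.n_phases):
            if counts[a] == 0:
                return a
        total = sum(counts)
        q = self.q[state]
        # Costs are minimized, so the bonus lowers the score of rarely tried phases.
        return min(
            range(self.n_phases),
            key=lambda a: q[a] - self.ucb_c * math.sqrt(2 * math.log(total) / counts[a]),
        )

    def update(self, state, action, own_cost, neighbour_costs, next_state):
        # Assumed coordination rule: own cost plus mean cost reported by neighbours.
        if neighbour_costs:
            cost = own_cost + sum(neighbour_costs) / len(neighbour_costs)
        else:
            cost = own_cost
        # Standard Q-learning update, written for costs (min instead of max).
        target = cost + self.gamma * min(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])
        self.counts[state][action] += 1
```

In a decentralized setup, one such agent would be created per junction; at each decision step every agent picks a phase with `select_epsilon_greedy` or `select_ucb`, the simulator reports costs, and each agent calls `update` with its own cost and the costs of its neighbouring junctions.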