Real-world ride-hailing vehicle repositioning using deep reinforcement learning

Bibliographic Details
Published in: Transportation research. Part C, Emerging technologies, 2021-09, Vol.130, p.103289, Article 103289
Main Authors: Jiao, Yan; Tang, Xiaocheng; Qin, Zhiwei (Tony); Li, Shuaiji; Zhang, Fan; Zhu, Hongtu; Ye, Jieping
Format: Article
Language: English
Description
Summary:
• A real-world deployed reinforcement learning-based algorithm for ride-hailing vehicle repositioning.
• A practical framework incorporating offline learning and online decision-time planning.
• Effective algorithmic designs for small-fleet and large-fleet scenarios.

We present a new practical framework based on deep reinforcement learning and decision-time planning for real-world vehicle repositioning on ride-hailing (a type of mobility-on-demand, MoD) platforms. Our approach learns the spatiotemporal state-value function using a batch training algorithm with deep value networks. The optimal repositioning action is generated on demand through value-based policy search, which combines planning and bootstrapping with the value networks. For large-fleet problems, we develop several algorithmic features that we incorporate into our framework and that we show induce coordination among the algorithmically guided vehicles. We benchmark our algorithm against baselines in a ride-hailing simulation environment to demonstrate its superiority in improving income efficiency, measured by income per hour. We have also designed and run a real-world experiment program with regular drivers on a major ride-hailing platform. On key metrics, we observed significantly positive results for our method compared with experienced drivers who performed idle-time repositioning based on their own expertise.
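The idea of generating a repositioning action at decision time from a learned state-value function can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a one-step lookahead over neighboring zones, a plain callable standing in for the value network, and no reward earned while the vehicle is idle. All names here (`choose_reposition`, `neighbors`, `travel_time`, `value`, `gamma`) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

Zone = str


def choose_reposition(
    current: Zone,
    t: int,
    neighbors: Dict[Zone, List[Zone]],
    travel_time: Callable[[Zone, Zone], int],
    value: Callable[[Zone, int], float],
    gamma: float = 0.95,
) -> Tuple[Zone, float]:
    """Pick a destination by one-step lookahead on a state-value function.

    Assumptions (illustrative only): time is discretized into integer steps,
    staying in place costs one step, and an idle vehicle earns no reward
    while moving, so each candidate is scored purely by the discounted
    bootstrapped value at its arrival state.
    """
    candidates = [current] + neighbors.get(current, [])
    best_zone, best_score = current, float("-inf")
    for dest in candidates:
        # Staying takes one time step; moving takes the travel time.
        dt = 1 if dest == current else travel_time(current, dest)
        # Bootstrapped score: discounted value of the arrival state.
        score = (gamma ** dt) * value(dest, t + dt)
        if score > best_score:
            best_zone, best_score = dest, score
    return best_zone, best_score
```

In this sketch the discount factor penalizes long repositioning trips, so a distant high-value zone is chosen only if its value outweighs the time spent reaching it. A deeper search or a coordination mechanism for many vehicles, as the summary mentions for large fleets, is outside what this example covers.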
ISSN: 0968-090X, 1879-2359
DOI: 10.1016/j.trc.2021.103289