Combinatorial Optimization Meets Reinforcement Learning: Effective Taxi Order Dispatching at Large-Scale

Bibliographic Details
Published in: IEEE Transactions on Knowledge and Data Engineering, 2023-10, Vol. 35 (10), p. 9812-9823
Main Authors: Tong, Yongxin, Shi, Dingyuan, Xu, Yi, Lv, Weifeng, Qin, Zhiwei, Tang, Xiaocheng
Format: Article
Language:English
Subjects:
Description
Summary: Ride hailing has become prevalent. Central to ride-hailing platforms is taxi order dispatching, which involves recommending a suitable driver for each order. Previous works use pure combinatorial optimization solutions for taxi dispatching, which suffer in practice due to the complex dynamics of demand and supply and the temporal dependency among dispatching decisions. Recent studies try to adopt data-driven methods into combinatorial optimization, hoping that knowledge from historical data would help overcome these challenges. Among these attempts, the adoption of reinforcement learning shows great promise, but current adoptions are a unidirectional integration, which restricts the potential performance gains. In this work, we propose Learning To Dispatch (LTD), a systematic solution that allows synergic integration of reinforcement learning and combinatorial optimization for large-scale taxi order dispatching. We demonstrate the necessity of online learning and taxi scheduling for reinforcement learning to work in synergy with combinatorial optimization, and devise corresponding algorithms. We also devise several techniques for more efficient computation of the bipartite matching. Experiments show our methods can improve utility and efficiency by up to 36.4% and 42.0%, respectively. In particular, it achieves state-of-the-art performance in terms of utility.
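The abstract frames dispatching as a bipartite matching between waiting orders and idle drivers, with edge weights that a reinforcement-learning model could estimate. A minimal illustrative sketch of that formulation (the score matrix, the `dispatch` helper, and the brute-force solver are hypothetical stand-ins, not the paper's accelerated matching algorithm):

```python
from itertools import permutations

def dispatch(scores):
    """Max-weight bipartite matching of orders to drivers by brute force.

    scores[i][j] is the (e.g. RL-estimated) value of assigning order i
    to driver j.  Returns (best_total, assignment), where assignment[i]
    is the driver chosen for order i.  Enumerating permutations is only
    feasible for tiny instances; large-scale systems use dedicated
    matching algorithms (e.g. the Hungarian algorithm) instead.
    """
    n = len(scores)
    best_total, best_assign = float("-inf"), None
    for perm in permutations(range(n)):
        total = sum(scores[i][perm[i]] for i in range(n))
        if total > best_total:
            best_total, best_assign = total, perm
    return best_total, best_assign

# Toy 3-order, 3-driver instance with made-up scores.
scores = [
    [4.0, 1.0, 2.0],  # order 0's value with drivers 0..2
    [2.0, 0.0, 5.0],  # order 1
    [3.0, 2.0, 1.0],  # order 2
]
total, assign = dispatch(scores)
print(total, assign)  # → 11.0 (0, 2, 1)
```

The synergy the paper describes would then come from updating the score matrix online as demand and supply evolve, rather than treating each matching round in isolation.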
ISSN: 1041-4347
1558-2191
DOI: 10.1109/TKDE.2021.3127077