Enabling Efficient Scheduling in Large-Scale UAV-Assisted Mobile-Edge Computing via Hierarchical Reinforcement Learning

Bibliographic Details
Published in: IEEE Internet of Things Journal, 2022-05, Vol. 9 (10), p. 7095-7109
Main Authors: Ren, Tao; Niu, Jianwei; Dai, Bin; Liu, Xuefeng; Hu, Zheyuan; Xu, Mingliang; Guizani, Mohsen
Format: Article
Language: English
Description
Summary: Owing to their high maneuverability and flexibility, unmanned aerial vehicles (UAVs) are considered a promising means of assisting mobile edge computing (MEC) in many scenarios, including disaster rescue and field operations. Most existing research studies trajectory and computation-offloading scheduling for UAV-assisted MEC in stationary environments and can struggle in dynamic environments where the locations of UAVs and mobile devices (MDs) vary significantly. Some recent work attempts to develop scheduling policies for dynamic environments by means of reinforcement learning (RL). However, because these methods must explore high-dimensional state and action spaces, they may fail to converge in large-scale networks where multiple UAVs serve numerous MDs. To address this challenge, we leverage the idea of "divide-and-conquer" and propose HT3O, a scalable scheduling approach for large-scale UAV-assisted MEC. First, HT3O is built with neural networks via deep RL to obtain real-time scheduling policies for MEC in dynamic environments. More importantly, to make HT3O more scalable, we decompose the scheduling problem into two-layered subproblems and optimize them alternately via hierarchical RL. This not only substantially reduces the complexity of each subproblem but also improves convergence efficiency. Experimental results show that HT3O can achieve promising performance improvements over state-of-the-art approaches.
ISSN: 2327-4662
DOI: 10.1109/JIOT.2021.3071531