Reinforcement Learning for Optimizing Delay-Sensitive Task Offloading in Vehicular Edge-Cloud Computing

Bibliographic Details
Published in: IEEE Internet of Things Journal, 2024-01, Vol. 11 (2), p. 2058-2069
Main Authors: Binh, Ta Huu, Son, Do Bao, Vo, Hiep, Nguyen, Binh Minh, Binh, Huynh Thi Thanh
Format: Article
Language: English
Description
Summary: As more and more devices connect to the Internet, the world has witnessed an ever-growing amount of data to be processed. Many of the resulting tasks require swift execution, while the storage and computation capabilities of Internet of Things (IoT) devices are limited. To address the demands of delay-sensitive tasks, we present a vehicular edge-cloud computing (VECC) network that provides powerful computation capabilities by deploying servers in proximity to task-generating devices and by using idle resources from smart vehicles to share the workload. Because these limited resources are vulnerable to sudden surges in data, it is imperative to incorporate cloud servers to prevent system overload. The challenge is then to find a task offloading strategy that coordinates edge and cloud resources to minimize the total time by which tasks exceed their quality baselines (the tolerance time) and to make all tasks meet their soft quality deadlines. To reach this goal, we first model the task offloading problem in VECC as a Markov decision process (MDP). We then propose advantage-oriented task offloading with a dueling actor-insulator network scheme to solve the problem. This value-based reinforcement learning (RL) method helps the agent find an effective policy without knowing how all state attributes change. The effectiveness of our method is demonstrated by performance evaluations based on real-world bus traces from Rio de Janeiro (Brazil). The experimental results show that our proposal reduces the tolerance time by at least 8.81% compared to other RL algorithms and by 75% compared to greedy approaches.
ISSN: 2327-4662
DOI: 10.1109/JIOT.2023.3292591