Multi-agent reinforcement learning for resource allocation in IoT networks with edge computing
Published in: | China communications 2020-09, Vol.17 (9), p.220-236 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | To support popular Internet of Things (IoT) applications such as virtual reality and mobile games, edge computing provides a front-end distributed computing archetype of centralized cloud computing with low latency and distributed data processing. However, it is challenging for multiple users to offload their computation tasks because they compete for spectrum, computation, and Radio Access Technology (RAT) resources. In this paper, we investigate the computation offloading mechanism of multiple selfish users with resource allocation in IoT edge computing networks by formulating it as a stochastic game. Each user is a learning agent that observes its local network environment and learns whether to compute locally or at the edge, with the goal of minimizing the long-term system cost by choosing its transmit power level, RAT, and sub-channel without any information about the other users. Since users' decisions are coupled at the gateway, we define the reward function of each user by considering the aggregated effect of the other users. A multi-agent reinforcement learning framework is then developed to solve the game with the proposed Independent Learners based Multi-Agent Q-learning (IL-based MA-Q) algorithm. Simulations demonstrate that the proposed IL-based MA-Q algorithm can solve the formulated problem and is more energy efficient, without extra cost for channel estimation at the centralized gateway. Finally, compared with three benchmark algorithms, it achieves better system cost performance and distributed computation offloading. |
ISSN: | 1673-5447 |
DOI: | 10.23919/JCC.2020.09.017 |
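
The summary describes independent learners that each run their own Q-learning over a choice of local computing or edge offloading with a transmit power level, RAT, and sub-channel. The following is only a minimal sketch of that general idea, not the paper's IL-based MA-Q algorithm: the action set sizes, the `toy_cost` function, the one-bit local observation, and every numeric constant are hypothetical and chosen purely for illustration.

```python
import itertools
import random

# Hypothetical action space: compute locally, or offload with a
# (power level, RAT, sub-channel) triple. Sizes are assumptions.
LOCAL = ("local",)
ACTIONS = [LOCAL] + [
    ("edge", p, r, c)
    for p, r, c in itertools.product([1, 2], [0, 1], [0, 1])
]


def collisions(joint, i):
    """Count other offloading users on the same (RAT, sub-channel) as user i."""
    a = joint[i]
    if a == LOCAL:
        return 0
    return sum(
        1 for j, b in enumerate(joint)
        if j != i and b != LOCAL and b[2:] == a[2:]
    )


def toy_cost(joint, i):
    """Illustrative cost only: fixed local cost, or power plus congestion."""
    a = joint[i]
    if a == LOCAL:
        return 1.0
    return 0.2 * a[1] + 0.5 * collisions(joint, i)


class IndependentQLearner:
    """Tabular Q-learning agent that sees only its own state and reward."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        self.q = {}  # state -> list of Q-values, one per action

    def _row(self, state):
        return self.q.setdefault(state, [0.0] * self.n_actions)

    def act(self, state):
        row = self._row(state)
        if self.rng.random() < self.epsilon:  # explore
            return self.rng.randrange(self.n_actions)
        return max(range(self.n_actions), key=row.__getitem__)  # exploit

    def update(self, state, action, reward, next_state):
        target = reward + self.gamma * max(self._row(next_state))
        row = self._row(state)
        row[action] += self.alpha * (target - row[action])


def run(n_agents=3, steps=2000, seed=0):
    agents = [IndependentQLearner(len(ACTIONS), seed=seed + i)
              for i in range(n_agents)]
    states = [0] * n_agents  # local observation: did I collide last step?
    for _ in range(steps):
        idx = [ag.act(s) for ag, s in zip(agents, states)]
        joint = [ACTIONS[k] for k in idx]
        for i, ag in enumerate(agents):
            # The reward reflects the aggregate effect of the other users,
            # but the agent never observes their actions directly.
            reward = -toy_cost(joint, i)
            nxt = int(collisions(joint, i) > 0)
            ag.update(states[i], idx[i], reward, nxt)
            states[i] = nxt
    return agents
```

Calling `run()` returns the trained agents, whose `q` tables can be inspected per local state. Each agent updates only from its own observation and reward, which is what makes the learners independent; the coupling between users enters solely through the congestion term in the assumed cost.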