Equilibrated and Fast Resources Allocation for Massive and Diversified MTC Services Using Multiagent Deep Reinforcement Learning
Published in: | IEEE Internet of Things Journal, 2023-01, Vol. 10 (1), p. 664-681 |
---|---|
Format: | Article |
Language: | English |
Summary: | Massive and diversified machine type communication (MTC) service is one of the development trends of MTC in the Internet of Things (IoT). Meanwhile, realizing network functions virtualization (NFV) is inseparable from reasonable virtual network function (VNF) scheduling and resource allocation. Deep reinforcement learning (DRL) has recently become one of the feasible solutions for VNF scheduling and resource allocation of MTC services. However, existing DRL solutions suffer from inapplicability to environments with both discrete and continuous variables, long training time, and nonequilibrium resource allocation. In this article, we first model the end-to-end (E2E) VNF scheduling and resource allocation of core network nodes, links, and access network subcarriers with different strategies, respectively, and propose a compound variable optimization problem aiming at maximizing the net income of the network provider. Then, we propose the mapping scheme of the absolute value of the signum function (ASgn mapping scheme) to simplify the compound variables of the optimization problem into continuous variables, so that the DRL algorithm is applicable. Moreover, we propose a model paralleling multiagent twin delayed deep deterministic (MPMA-TD3) policy gradient algorithm to handle massive services and to reduce both the training time and the action space of agents. Finally, we improve the MPMA-TD3 algorithm to handle diversified services, solve the problem of nonequilibrium resource allocation, and realize reasonable resource allocation for each service. Simulation results show that the proposed algorithms are better than other algorithms in reward, delay, cost, and training time for massive services. Further, the Improved MPMA-TD3 algorithm has the best service equilibrating ability. |
---|---|
ISSN: | 2327-4662 |
DOI: | 10.1109/JIOT.2022.3204359 |
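
The summary names an ASgn mapping scheme (absolute value of the signum function) but does not give its formula. The sketch below only illustrates the underlying mathematical fact that |sgn(x)| is 0 when x is 0 and 1 otherwise, so a single continuous value can carry both an amount and an implied on/off decision. The `decode_action` helper, its `threshold` parameter, and the clipping step are hypothetical and are not taken from the article.

```python
def sgn(x: float) -> int:
    """Signum function: -1 for negative x, 0 for zero, 1 for positive x."""
    return (x > 0) - (x < 0)


def asgn(x: float) -> int:
    """Absolute value of the signum: 0 when x == 0, otherwise 1."""
    return abs(sgn(x))


def decode_action(raw_action, threshold=0.0):
    """Hypothetical decoding of a continuous agent output (an assumption).

    Each raw value is clipped at `threshold` to give a non-negative
    resource amount; the binary selection flag is then |sgn(amount)|,
    so no separate discrete variable is needed.
    """
    amounts = [max(a - threshold, 0.0) for a in raw_action]
    flags = [asgn(a) for a in amounts]
    return flags, amounts


if __name__ == "__main__":
    flags, amounts = decode_action([0.4, -0.2, 0.0])
    print(flags, amounts)
```

Under this reading, the learning agent only ever emits continuous values, which is the property the summary says makes a DRL algorithm such as TD3 applicable; how the article actually constructs the mapping should be checked against the full text.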