Equilibrated and Fast Resources Allocation for Massive and Diversified MTC Services Using Multiagent Deep Reinforcement Learning

Bibliographic Details
Published in:	IEEE Internet of Things Journal, 2023-01, Vol. 10 (1), p. 664-681
Main Authors:	Tang, Lun, Du, Yucong, Chen, Qianbin, Liu, Qinghai, Li, Jinyu, Li, Shirui
Format:	Article
Language:	English
Subjects:	Algorithms; Continuity (mathematics); Deep learning; Internet of Things; Internet of Things (IoT); Job shop scheduling; Machine learning; Machine learning algorithms; machine type communication (MTC) services; Mapping; mapping scheme; multiagent deep reinforcement learning (DRL); Multiagent systems; Optimization; Reinforcement learning; Resource allocation; Resource management; Resource scheduling; service equilibrium; Training; Variables; Virtual networks

Description
Summary:	Massive and diversified machine type communication (MTC) services are one of the development trends of MTC in the Internet of Things (IoT). Meanwhile, realizing network functions virtualization (NFV) depends on reasonable virtual network function (VNF) scheduling and resource allocation. Deep reinforcement learning (DRL) has recently become one of the feasible solutions for VNF scheduling and resource allocation of MTC services. However, existing DRL solutions have three problems: they are inapplicable to environments with both discrete and continuous variables, they take a long time to train, and they allocate resources in a nonequilibrium way. In this article, we first model the end-to-end (E2E) VNF scheduling and the resource allocation of core network nodes, links, and access network subcarriers, each with a different strategy, and formulate a compound variable optimization problem that aims to maximize the net income of the network provider. Then, we propose the mapping scheme of the absolute value of the signum function (ASgn mapping scheme) to convert the compound variables of the optimization problem into continuous variables, so that the DRL algorithm is applicable. Moreover, we propose a model-paralleling multiagent twin delayed deep deterministic policy gradient (MPMA-TD3) algorithm to handle massive services and to reduce both the training time and the action space of the agents. Finally, we improve the MPMA-TD3 algorithm to handle diversified services, solve the problem of nonequilibrium resource allocation, and realize reasonable resource allocation for each service. Simulation results show that the proposed algorithms outperform other algorithms in reward, delay, cost, and training time for massive services. Further, the Improved MPMA-TD3 algorithm has the best service equilibrating ability.
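The summary does not spell out the ASgn mapping scheme, so the following is only a minimal sketch of the general idea its name suggests: |sgn(x)| is 0 when x is 0 and 1 otherwise, so a single continuous value can carry both a binary decision (selected or not) and a continuous amount (how much resource). The function names `asgn` and `decode_action`, the clipping of negative outputs to zero, and the rescaling against a `capacity` budget are illustrative assumptions, not the authors' formulation.

```python
import numpy as np


def asgn(x):
    """Absolute value of the signum function: 0 where x == 0, else 1."""
    return np.abs(np.sign(x))


def decode_action(raw_action, capacity):
    """Hypothetical decoder for a continuous agent output.

    Assumption: each entry of raw_action is the continuous output for one
    candidate resource (e.g. a node, link, or subcarrier). Negative outputs
    are treated as "no allocation".
    """
    # Continuous part: non-negative requested amounts.
    alloc = np.clip(raw_action, 0.0, None)
    # Discrete part recovered from the continuous one: 1 if anything is allocated.
    selected = asgn(alloc).astype(int)
    # Assumed budget constraint: scale down if the request exceeds capacity.
    total = alloc.sum()
    if total > capacity:
        alloc = alloc * capacity / total
    return selected, alloc


if __name__ == "__main__":
    selected, alloc = decode_action(np.array([-0.3, 0.0, 0.5, 1.5]), capacity=1.0)
    print(selected, alloc)
```

Under these assumptions the agent only ever emits continuous values, which is what makes a continuous-action method such as TD3 usable; the discrete selection is derived afterwards rather than learned as a separate variable.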
ISSN:	2327-4662
DOI:	10.1109/JIOT.2022.3204359