Deep Transfer Reinforcement Learning for Beamforming and Resource Allocation in Multi-Cell MISO-OFDMA Systems

Bibliographic Details
Published in:	IEEE Transactions on Signal and Information Processing over Networks, 2022, Vol. 8, pp. 815-829
Main Authors:	Wang, Xiaoming; Sun, Gaoxiang; Xin, Yuanxue; Liu, Ting; Xu, Youyun
Format:	Article
Language:	English
Subjects:	Algorithms; Array signal processing; Beamforming; Coordination; Distillation; Downlink; Frequency division multiple access; Knowledge management; Machine learning; MADQN; MISO; MISO (control systems); Modules; multi-cell; Multiagent systems; Neural networks; OFDMA; Orthogonal Frequency Division Multiplexing; Resource allocation; Resource management; Training; Transfer learning; Wireless communication; Wireless networks

Description
Summary:	Orthogonal frequency division multiple access (OFDMA) is a promising technology for meeting the heavy access demand and high data-rate requirements of fifth-generation (5G) networks. In this paper, we study joint beamforming coordination and resource allocation in downlink multi-cell multiple-input single-output OFDMA (MISO-OFDMA) systems. First, we divide the allocation framework into a beamforming coordination and power allocation (BCPA) module and a subcarrier allocation (SA) module. Then, we design a multi-agent deep Q-network (MADQN) algorithm for the allocation framework. Furthermore, we propose a MADQN-based transfer learning framework using knowledge distillation, called transfer learning-MADQN (TL-MADQN), to improve the adaptability of the neural networks to different wireless schemes. TL-MADQN exploits the neural networks and parameters distilled from pre-trained agents, together with the experience collected by new agents, so that the new agents complete their training quickly and effectively in the new network environment. Finally, we adjust the allocation policy to maximize the sum data rate of all users by updating the weights of each neural network. Simulation results show that the proposed MADQN algorithm outperforms the baseline algorithms. Moreover, the TL-MADQN framework further improves the convergence speed and data rate, which validates its effectiveness.
ISSN:	2373-776X, 2373-7778
DOI:	10.1109/TSIPN.2022.3208432
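
Illustrative note: the summary says that TL-MADQN transfers knowledge from pre-trained agents to new agents through knowledge distillation, but this record does not give the loss function. The sketch below shows one common way such a transfer can be set up, assuming a temperature-softened KL divergence between a teacher agent's Q-values and a student agent's Q-values for the same state. The function names, the temperature parameter, and the choice of KL divergence are assumptions made for illustration and are not taken from the paper.

```python
import math


def log_softmax(q_values, temperature=1.0):
    """Return log-probabilities of a temperature-scaled softmax over Q-values."""
    scaled = [q / temperature for q in q_values]
    peak = max(scaled)  # subtract the maximum for numerical stability
    log_total = math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - peak - log_total for s in scaled]


def distillation_loss(teacher_q, student_q, temperature=1.0):
    """KL(teacher || student) between softened action distributions.

    teacher_q: Q-values from a pre-trained agent (hypothetical input).
    student_q: Q-values from the new agent for the same state.
    """
    log_p = log_softmax(teacher_q, temperature)
    log_q = log_softmax(student_q, temperature)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


if __name__ == "__main__":
    teacher = [1.0, 2.0, 3.0]   # made-up Q-values for three actions
    student = [3.0, 2.0, 1.0]
    print(distillation_loss(teacher, teacher))  # identical preferences
    print(distillation_loss(teacher, student))  # reversed preferences
```

Under these assumptions the loss is zero when the student ranks and weights actions exactly as the teacher does, and it grows as the two softened distributions diverge. In a full training loop a term like this would typically be combined with the usual temporal-difference loss computed on the new agent's own experience.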