A Collaborative Multi-Agent Deep Reinforcement Learning-Based Wireless Power Allocation With Centralized Training and Decentralized Execution

Bibliographic Details
Published in: IEEE Transactions on Communications, 2024-11, Vol. 72 (11), p. 7006-7016
Main Authors: Kopic, Amna; Perenda, Erma; Gacanin, Haris
Format: Article
Language:English
Description
Summary: Despite the success of Deep Reinforcement Learning (DRL) in radio-resource management within multi-cell wireless networks, applying it to power allocation in ultra-dense 5G and beyond networks poses challenges. Existing multi-agent DRL-based methods often adopt a fully centralized approach, yet they tend to overlook communication overhead costs. In this paper, we model a multi-cell network as a collaborative multi-agent DRL system, implementing a centralized training-decentralized execution approach for accurate and real-time decision-making, thereby eliminating communication overhead during execution. We carefully design the DRL agents' input observations, actions, and rewards to address potential impractical power allocation policies in multi-carrier systems and to ensure strict compliance with transmit power constraints. Through extensive simulations, we assess the sensitivity of the proposed DRL-based power allocation to various exploration methods and system parameters. Results indicate superior performance of DRL-based power allocation with continuous action space in complex network environments. Conversely, simpler network settings with fewer subcarriers and users require fewer power allocation actions, ensuring rapid convergence. By leveraging a fast exploration rate, DRL-based power allocation with discrete action space outperforms conventional algorithms, achieving a 36% relative sum-rate increase within 60,000 training episodes.
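To make the abstract's two key ingredients concrete, the sketch below illustrates (i) an action mapping that always satisfies a per-cell transmit power budget across subcarriers and (ii) a network sum-rate computation of the kind typically used as the reward in such multi-agent power-allocation setups. All names and values (P_MAX, NOISE, the channel-gain draws, the softmax power split) are illustrative assumptions, not the authors' exact design.

```python
# Minimal sketch (NumPy) of a constraint-respecting power-allocation action
# mapping and a sum-rate reward; the exact observation/action/reward design
# in the paper may differ.
import numpy as np

N_CELLS, N_SUB = 3, 4          # assumed small network: cells (agents) x subcarriers
P_MAX = 1.0                    # per-cell transmit power budget (W), assumed
NOISE = 1e-9                   # receiver noise power (W), assumed

rng = np.random.default_rng(0)
# |h|^2 channel gains: gain[k, j, n] = gain from cell j's transmitter to
# cell k's scheduled user on subcarrier n (assumed exponential draws).
gain = rng.exponential(scale=1e-6, size=(N_CELLS, N_CELLS, N_SUB))

def powers_from_logits(logits):
    """Map each agent's raw action (one logit per subcarrier) to powers that
    always satisfy the per-cell budget: softmax fractions times P_MAX."""
    z = logits - logits.max(axis=-1, keepdims=True)
    frac = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return P_MAX * frac                       # shape (N_CELLS, N_SUB)

def sum_rate(power):
    """Network sum rate (bit/s/Hz): Shannon rate per cell and subcarrier with
    co-channel interference from the other cells."""
    rate = 0.0
    for k in range(N_CELLS):
        for n in range(N_SUB):
            signal = gain[k, k, n] * power[k, n]
            interf = sum(gain[k, j, n] * power[j, n]
                         for j in range(N_CELLS) if j != k)
            rate += np.log2(1.0 + signal / (interf + NOISE))
    return rate

# Decentralized execution: each agent would produce its own logits from local
# observations only; random logits stand in for the trained actor networks here.
logits = rng.normal(size=(N_CELLS, N_SUB))
p = powers_from_logits(logits)
print("per-cell power used:", p.sum(axis=1))   # each equals P_MAX by construction
print("sum rate:", sum_rate(p))
```

Under a centralized training-decentralized execution scheme, a critic would see the joint state during training while each actor, at execution time, maps only its local observations to the logits above, which is what removes inter-cell signaling overhead during operation.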
ISSN: 0090-6778, 1558-0857
DOI: 10.1109/TCOMM.2024.3409530