A Collaborative Multi-Agent Deep Reinforcement Learning-Based Wireless Power Allocation With Centralized Training and Decentralized Execution

Bibliographic Details
Published in: IEEE Transactions on Communications, 2024-11, Vol. 72 (11), p. 7006-7016
Main Authors: Kopic, Amna; Perenda, Erma; Gacanin, Haris
Format: Article
Language:English
Description
Summary: Despite the success of Deep Reinforcement Learning (DRL) in radio-resource management within multi-cell wireless networks, applying it to power allocation in ultra-dense 5G and beyond networks poses challenges. Existing multi-agent DRL-based methods often adopt a fully centralized approach, yet they tend to overlook communication overhead costs. In this paper, we model a multi-cell network as a collaborative multi-agent DRL system, implementing a centralized training-decentralized execution approach for accurate and real-time decision-making, thereby eliminating communication overhead during execution. We carefully design the DRL agents' input observations, actions, and rewards to address potential impractical power allocation policies in multi-carrier systems and to ensure strict compliance with transmit power constraints. Through extensive simulations, we assess the sensitivity of the proposed DRL-based power allocation to various exploration methods and system parameters. Results indicate superior performance of DRL-based power allocation with continuous action space in complex network environments. Conversely, simpler network settings with fewer subcarriers and users require fewer power allocation actions, ensuring rapid convergence. By leveraging a fast exploration rate, DRL-based power allocation with discrete action space outperforms conventional algorithms, achieving a 36% relative sum-rate increase within 60,000 training episodes.
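To make the abstract's two key ingredients concrete, the sketch below illustrates (i) an action mapping that always satisfies a per-cell transmit power budget across subcarriers and (ii) a network sum-rate computation of the kind typically used as the reward in such multi-agent power-allocation setups. All names and values (P_MAX, NOISE, the channel-gain draws, the softmax power split) are illustrative assumptions, not the authors' exact design.

```python
# Minimal sketch (NumPy) of a constraint-respecting power-allocation action
# mapping and a sum-rate reward; the exact observation/action/reward design
# in the paper may differ.
import numpy as np

N_CELLS, N_SUB = 3, 4          # assumed small network: cells (agents) x subcarriers
P_MAX = 1.0                    # per-cell transmit power budget (W), assumed
NOISE = 1e-9                   # receiver noise power (W), assumed

rng = np.random.default_rng(0)
# |h|^2 channel gains: gain[k, j, n] = gain from cell j's transmitter to
# cell k's scheduled user on subcarrier n (assumed exponential draws).
gain = rng.exponential(scale=1e-6, size=(N_CELLS, N_CELLS, N_SUB))

def powers_from_logits(logits):
    """Map each agent's raw action (one logit per subcarrier) to powers that
    always satisfy the per-cell budget: softmax fractions times P_MAX."""
    z = logits - logits.max(axis=-1, keepdims=True)
    frac = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return P_MAX * frac                       # shape (N_CELLS, N_SUB)

def sum_rate(power):
    """Network sum rate (bit/s/Hz): Shannon rate per cell and subcarrier with
    co-channel interference from the other cells."""
    rate = 0.0
    for k in range(N_CELLS):
        for n in range(N_SUB):
            signal = gain[k, k, n] * power[k, n]
            interf = sum(gain[k, j, n] * power[j, n]
                         for j in range(N_CELLS) if j != k)
            rate += np.log2(1.0 + signal / (interf + NOISE))
    return rate

# Decentralized execution: each agent would produce its own logits from local
# observations only; random logits stand in for the trained actor networks here.
logits = rng.normal(size=(N_CELLS, N_SUB))
p = powers_from_logits(logits)
print("per-cell power used:", p.sum(axis=1))   # each equals P_MAX by construction
print("sum rate:", sum_rate(p))
```

Under a centralized training-decentralized execution scheme, a critic would see the joint state during training while each actor, at execution time, maps only its local observations to the logits above, which is what removes inter-cell signaling overhead during operation.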
ISSN: 0090-6778, 1558-0857
DOI: 10.1109/TCOMM.2024.3409530