A Collaborative Multi-Agent Deep Reinforcement Learning-Based Wireless Power Allocation With Centralized Training and Decentralized Execution
| Published in: | IEEE Transactions on Communications, 2024-11, Vol. 72 (11), pp. 7006-7016 |
|---|---|
| Main Authors: | , , |
| Format: | Article |
| Language: | English |
| Summary: | Despite the success of deep reinforcement learning (DRL) in radio-resource management for multi-cell wireless networks, applying it to power allocation in ultra-dense 5G and beyond networks remains challenging. Existing multi-agent DRL-based methods often adopt a fully centralized approach and overlook the resulting communication overhead. In this paper, we model a multi-cell network as a collaborative multi-agent DRL system and implement a centralized-training, decentralized-execution approach for accurate, real-time decision-making, thereby eliminating communication overhead during execution. We carefully design the DRL agents' input observations, actions, and rewards to avoid impractical power allocation policies in multi-carrier systems and to ensure strict compliance with transmit power constraints. Through extensive simulations, we assess the sensitivity of the proposed DRL-based power allocation to various exploration methods and system parameters. Results show that DRL-based power allocation with a continuous action space performs best in complex network environments, whereas simpler network settings with fewer subcarriers and users require fewer power allocation actions and therefore converge rapidly. By leveraging a fast exploration rate, DRL-based power allocation with a discrete action space outperforms conventional algorithms, achieving a 36% relative sum-rate increase within 60,000 training episodes. |
| ISSN: | 0090-6778; 1558-0857 |
| DOI: | 10.1109/TCOMM.2024.3409530 |
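
The abstract describes a centralized-training, decentralized-execution (CTDE) design in which each base station acts on local observations at run time while a centralized critic is used only during training, and in which the action design enforces transmit power constraints by construction. As an illustration only, the sketch below shows one common way such a setup can be wired in PyTorch; it is not the authors' implementation, and all names and constants (`Actor`, `CentralCritic`, `N_AGENTS`, `N_SC`, `OBS_DIM`, `P_MAX`) are assumptions made for this example.

```python
# Minimal CTDE sketch for multi-cell power allocation (illustrative only;
# not the paper's architecture). Assumed constants: N_AGENTS base stations,
# N_SC subcarriers, OBS_DIM local-observation size, P_MAX per-cell budget.
import torch
import torch.nn as nn

N_AGENTS, N_SC, OBS_DIM, P_MAX = 4, 8, 16, 1.0

class Actor(nn.Module):
    """Decentralized actor: maps a local observation to per-subcarrier
    power levels. The softmax output sums to 1, so scaling by P_MAX
    satisfies the per-cell transmit power constraint by construction."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM, 64), nn.ReLU(), nn.Linear(64, N_SC))

    def forward(self, obs):
        return P_MAX * torch.softmax(self.net(obs), dim=-1)

class CentralCritic(nn.Module):
    """Centralized critic: during training it sees all agents'
    observations and actions; it is discarded at execution time."""
    def __init__(self):
        super().__init__()
        in_dim = N_AGENTS * (OBS_DIM + N_SC)
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, all_obs, all_actions):
        x = torch.cat([all_obs.flatten(1), all_actions.flatten(1)], dim=1)
        return self.net(x)

actors = [Actor() for _ in range(N_AGENTS)]
critic = CentralCritic()

# Decentralized execution: each actor uses only its own observation,
# so no inter-cell message exchange is needed at run time.
obs = torch.randn(1, N_AGENTS, OBS_DIM)  # a batch of joint observations
actions = torch.stack([a(obs[:, i]) for i, a in enumerate(actors)], dim=1)
q_value = critic(obs, actions)           # used during training only
print(actions.sum(dim=-1))               # each cell's powers sum to P_MAX
```

Because the critic is dropped after training, run-time decisions depend only on local observations, which is the source of the zero execution-time communication overhead the abstract claims.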