Deep Reinforcement Learning-Based Resource Allocation and Power Control in Small Cells With Limited Information Exchange


Bibliographic Details
Published in: IEEE Transactions on Vehicular Technology, 2020-11, Vol. 69 (11), p. 13768-13783
Main Authors: Jang, Jonggyu; Yang, Hyun Jong
Format: Article
Language: English
Description
Summary: In multi-user downlink small cell networks, cooperative resource allocation (RA) within a small cell cluster is a key technique to enhance network capacity. However, capacity-maximizing RA in frequency-selective fading channels requires global channel state information (CSI) of users within a small cell cluster, which makes it infeasible in practical networks with limited direct link capacity. To circumvent this global CSI assumption, most existing studies on RA have been based on weaker CSI assumptions such as local CSI and local CSI at the transmitters (CSIT). Nevertheless, cost functions with local CSI or local CSIT in the literature rely on heuristic formulations, because the sum-rate cannot be computed without global CSI. In this paper, we propose a deep reinforcement learning-based RA algorithm to maximize the sum-rate for any given limited information on instantaneous CSI or on the sum-rate in the previous period. The proposed scheme is not restricted to certain CSI assumptions, but attempts to find the best RA for any given information such as quantized local CSI and quantized local CSIT; thus, it is applicable to any given direct link capacity. The proposed algorithm is self-adaptive in time-varying channels, since it is not divided into training and test phases. We modify the target neural network (TNN) scheme to enhance the sum-rate and the convergence speed. Numerical simulations confirm that: i) the proposed algorithm outperforms the conventional algorithms even under the same CSI assumption, such as local CSI or local CSIT; ii) a flexible trade-off between the amount of CSI and the sum-rate is realizable in practical systems.
ISSN: 0018-9545, 1939-9359
DOI: 10.1109/TVT.2020.3027013
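
The summary notes that the sum-rate cannot be computed without global CSI. A minimal sketch of why, assuming the textbook Shannon rate with interference treated as noise on a single sub-band (the paper's own system model may differ): each user's rate depends on the cross-link gains from every other transmitter, not only on its direct link. The function name `sum_rate`, the `gains[i][j]` layout, and the numbers are hypothetical and chosen only for illustration.

```python
import math

def sum_rate(gains, powers, noise_power=1.0):
    """Sum of Shannon rates (bits/s/Hz) over all links on one sub-band.

    gains[i][j]: channel power gain from transmitter j to user i
                 (hypothetical layout, not taken from the paper).
    powers[j]:   transmit power of transmitter j.
    """
    total = 0.0
    for i, row in enumerate(gains):
        signal = row[i] * powers[i]
        # Interference needs the cross-link gains, i.e. global CSI.
        interference = sum(
            g * p for j, (g, p) in enumerate(zip(row, powers)) if j != i
        )
        total += math.log2(1.0 + signal / (interference + noise_power))
    return total

# Two small cells with one user each; off-diagonal entries are strong cross links.
gains = [[1.0, 4.0],
         [4.0, 1.0]]
both_on = sum_rate(gains, [1.0, 1.0])  # both cells transmit
one_on = sum_rate(gains, [1.0, 0.0])   # second cell stays silent
```

With these illustrative gains, silencing one cell yields a higher sum-rate than letting both transmit, a decision that cannot be evaluated from the direct-link gains alone.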