Deep Reinforcement Learning-Based Resource Allocation and Power Control in Small Cells With Limited Information Exchange
Published in: | IEEE Transactions on Vehicular Technology, 2020-11, Vol. 69 (11), p. 13768-13783 |
---|---|
Format: | Article |
Language: | English |
Summary: | In multi-user downlink small cell networks, cooperative resource allocation (RA) within a small cell cluster is a key technique to enhance network capacity. However, capacity-maximizing RA in frequency-selective fading channels requires global channel state information (CSI) of users within a small cell cluster, which makes it infeasible in practical networks with limited direct link capacity. To circumvent this global CSI assumption, most existing studies on RA have been based on more limited CSI assumptions, such as local CSI and local CSI at the transmitters (CSIT). Nevertheless, the cost functions used with local CSI or local CSIT in the literature rely on heuristic formulations, because the sum-rate cannot be computed without global CSI. In this paper, we propose a deep reinforcement learning-based RA algorithm that maximizes the sum-rate given any limited information on the instantaneous CSI or on the sum-rate of the previous period. The proposed scheme is not restricted to particular CSI assumptions; instead, it attempts to find the best RA for whatever information is available, such as quantized local CSI or quantized local CSIT, and is therefore applicable to any given direct link capacity. The proposed algorithm is self-adaptive in time-varying channels, since it is not divided into training and test phases. We modify the target neural network (TNN) scheme to enhance the sum-rate and the convergence speed. Numerical simulations confirm that i) the proposed algorithm outperforms the conventional algorithms even under the same CSI assumptions, such as local CSI and local CSIT, and ii) a flexible trade-off between the amount of CSI and the sum-rate is realizable in practical systems. |
ISSN: | 0018-9545, 1939-9359 |
DOI: | 10.1109/TVT.2020.3027013 |
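
The summary's remark that the sum-rate cannot be computed without global CSI can be illustrated with a minimal sketch. This is not the paper's algorithm: it is the standard Shannon sum-rate for a single subchannel, and the names `sum_rate`, `quantize`, `gains`, `powers`, and `noise_power` are hypothetical, chosen only for illustration. Each user's rate depends on the cross-link gains from every other base station, which is why a cell holding only local CSI cannot evaluate the objective directly. The uniform quantizer is likewise only an assumed example of how local CSI might be compressed before being exchanged.

```python
import math


def sum_rate(gains, powers, noise_power):
    """Shannon sum-rate (bit/s/Hz) on one subchannel.

    gains[i][j] is the channel power gain from base station j to the
    user served by cell i (hypothetical layout); powers[j] is the
    transmit power of base station j.
    """
    total = 0.0
    for i, row in enumerate(gains):
        signal = row[i] * powers[i]
        # Interference needs the gains from all other cells: global CSI.
        interference = sum(
            row[j] * powers[j] for j in range(len(powers)) if j != i
        )
        total += math.log2(1.0 + signal / (interference + noise_power))
    return total


def quantize(value, levels, max_value):
    """Uniformly map a value in [0, max_value] to an index in [0, levels - 1]."""
    clipped = min(max(value, 0.0), max_value)
    return min(int(clipped / max_value * levels), levels - 1)


# Two cells without cross-interference: each user sees SINR = 1.
isolated = sum_rate([[1.0, 0.0], [0.0, 1.0]], [1.0, 1.0], 1.0)

# Same cells with full cross-interference: each user sees SINR = 0.5.
interfering = sum_rate([[1.0, 1.0], [1.0, 1.0]], [1.0, 1.0], 1.0)
```

Under these assumptions, fewer quantization levels mean fewer bits to exchange per CSI value, which is the kind of trade-off between the amount of CSI and the sum-rate that the summary refers to.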