An Enhancement Framework for RDMA Congestion Control in Multi-tenant Datacenters
Main Authors: | , , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
Summary: | Remote Direct Memory Access (RDMA) has been gaining popularity in multi-tenant datacenters due to its high performance and low CPU utilization. RDMA requires a lossless underlying network to fully realize its potential. Thus, hop-by-hop Priority Flow Control (PFC) is deployed to ensure losslessness, and congestion control is also needed to allocate per-flow bandwidth. However, we find that existing RDMA congestion control schemes in multi-tenant datacenters fail to achieve rapid rate convergence due to their heuristic nature and may exacerbate PFC side effects, including head-of-line blocking, unfairness, and even deadlocks. In this paper, we propose an enhancement framework for RDMA congestion control called RDI. RDI has several noteworthy properties. First, RDI uses receiver-side information and the network congestion level to provide a precise guide rate for congested flows, thus alleviating congestion and the corresponding PFC issues. Second, RDI is transparent to tenants and compatible with existing RDMA network architectures, without requiring modifications to in-network devices. We evaluate RDI under realistic traffic traces. The results show that a congestion control scheme enhanced by RDI significantly outperforms the original in terms of throughput and flow completion time (FCT), while also reducing the side effects of PFC. For instance, RDI-enhanced congestion control shortens the 99th-percentile tail FCT and the average FCT by up to 44% and 60%, respectively. |
---|---|
ISSN: | 2374-9709 |
DOI: | 10.1109/NOMS59830.2024.10574987 |