Revisiting network congestion avoidance through adaptive packet-chaining reservation
Published in: | Computer networks (Amsterdam, Netherlands : 1999), 2022-07, Vol.212, p.109008, Article 109008 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Endpoint congestion is a bottleneck in high-performance computing (HPC) networks that severely degrades system performance, especially for latency-sensitive applications. When long messages (or flows) last far longer than the round-trip time (RTT), proactive or reactive countermeasures, an effective solution to endpoint congestion, can dynamically keep the injection rate within a proper range. However, many HPC applications produce hybrid traffic (a mix of short and long messages) and are dominated by short messages. Existing proactive congestion avoidance methods struggle to schedule the rapidly changing traffic caused by these short messages. In this paper, we first propose the Packet-Chaining Reservation Protocol (PCRP), a novel congestion management strategy that combines the advantages of proactive (scheduling the whole flow) and reactive (scheduling a single packet) congestion avoidance techniques. It is, in effect, a compromise between the two scheduling approaches: we select a chain of packets as a flexible reservation granularity between the whole flow and a single packet. PCRP allows small flows to be transmitted speculatively without being discarded. It also gives small flows an appropriate priority based on the detected traffic conditions. PCRP responds quickly to network conditions, effectively avoiding endpoint congestion and reducing the average flow delay. However, PCRP is suitable only for short-flow-dominant traffic and performs poorly on other traffic, because it can starve longer flows and thus introduce unacceptable tail delay. Therefore, we further propose the Packet-Chaining Reservation Protocol with Adaptive Framework (PCRP+), a reservation protocol that flexibly adjusts its scheduling strategy according to the network load. By adopting the adaptive framework, PCRP+ achieves and maintains low tail delay and remains applicable across a wide range of traffic patterns. 
We conduct extensive experiments to evaluate PCRP+ and compare it with the Speculative Reservation Protocol (SRP) and the Bilateral Flow Reservation Protocol (BFRP), the two most typical proactive reservation-based protocols. Evaluation results demonstrate that our design reduces flow latency by an average of 25.71%, 21.21%, and 29.01% for hotspot traffic, uniform traffic, and GPCNeT traffic, respectively.
• PCRP selects packet chaining as the efficient granularity of the reservation. • Speculative pack |
---|---|
ISSN: | 1389-1286 1872-7069 |
DOI: | 10.1016/j.comnet.2022.109008 |