TY - GEN
T1 - Breaking the transience-equilibrium nexus
T2 - 18th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2021
AU - Liu, Shiyu
AU - Ghalayini, Ahmad
AU - Alizadeh, Mohammad
AU - Prabhakar, Balaji
AU - Rosenblum, Mendel
AU - Sivaraman, Anirudh
N1 - Publisher Copyright: © 2021 by The USENIX Association.
PY - 2021
Y1 - 2021
N2 - Recent datacenter transport protocols rely heavily on rich congestion signals from the network, impeding their deployment in environments such as the public cloud. In this paper, we explain this trend by showing that, without rich congestion signals, there is a strong tradeoff between a packet transport's equilibrium and transience performance. We then propose a simple approach to resolve this tension without complicating the transport protocol and without rich congestion signals from the network. Our approach factors the transport into two separate components for equilibrium and transient handling. For equilibrium handling, we continue to use existing congestion control protocols. For transients, we develop a new underlay algorithm, On-Ramp, which intercepts and holds any protocol's packets at the network edge during transient overload. On-Ramp detects transient overloads using accurate measurements of one-way delay, made possible in software by a recently developed time-synchronization algorithm. On the Google Cloud Platform, On-Ramp improves the 99th percentile request completion time (RCT) of incast traffic of CUBIC by 2.8× and BBR by 5.6×. In a bare-metal cloud (CloudLab), On-Ramp improves the RCT of CUBIC by 4.1×. In ns-3 simulations, which model more efficient NIC-based implementations of On-Ramp, On-Ramp improves RCTs of DCQCN, TIMELY, DCTCP and HPCC to varying degrees depending on the workload. In all three environments, On-Ramp also improves the flow completion time of non-incast background traffic. In an evaluation at Facebook, On-Ramp significantly reduces the latency of computing traffic while ensuring the throughput of storage traffic is not affected.
UR - http://www.scopus.com/inward/record.url?scp=85106189815&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85106189815&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85106189815
T3 - Proceedings of the 18th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2021
SP - 47
EP - 61
BT - Proceedings of the 18th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2021
PB - USENIX Association
Y2 - 12 April 2021 through 14 April 2021
ER -