Breaking the transience-equilibrium nexus: A new approach to datacenter packet transport

Shiyu Liu, Ahmad Ghalayini, Mohammad Alizadeh, Balaji Prabhakar, Mendel Rosenblum, Anirudh Sivaraman

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Recent datacenter transport protocols rely heavily on rich congestion signals from the network, impeding their deployment in environments such as the public cloud. In this paper, we explain this trend by showing that, without rich congestion signals, there is a strong tradeoff between a packet transport's equilibrium and transience performance. We then propose a simple approach to resolve this tension without complicating the transport protocol and without rich congestion signals from the network. Our approach factors the transport into two separate components for equilibrium and transient handling. For equilibrium handling, we continue to use existing congestion control protocols. For transients, we develop a new underlay algorithm, On-Ramp, which intercepts and holds any protocol's packets at the network edge during transient overload. On-Ramp detects transient overloads using accurate measurements of one-way delay, made possible in software by a recently developed time-synchronization algorithm. On the Google Cloud Platform, On-Ramp improves the 99th percentile request completion time (RCT) of incast traffic of CUBIC by 2.8× and BBR by 5.6×. In a bare-metal cloud (CloudLab), On-Ramp improves the RCT of CUBIC by 4.1×. In ns-3 simulations, which model more efficient NIC-based implementations of On-Ramp, On-Ramp improves RCTs of DCQCN, TIMELY, DCTCP and HPCC to varying degrees depending on the workload. In all three environments, On-Ramp also improves the flow completion time of non-incast background traffic. In an evaluation at Facebook, On-Ramp significantly reduces the latency of computing traffic while ensuring the throughput of storage traffic is not affected.
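The edge-holding idea described in the abstract can be illustrated with a minimal sketch. It is not the On-Ramp algorithm itself: the class name `EdgeGate`, the threshold parameter, and the rule "hold packets for as long as the measured one-way delay exceeds the threshold" are assumptions made here for illustration, and the sketch presumes sender and receiver clocks are already synchronized well enough for one-way delay to be meaningful.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class EdgeGate:
    """Hypothetical per-flow gate at the network edge (illustrative only)."""

    threshold_us: float            # assumed one-way-delay threshold, in microseconds
    paused_until_us: float = 0.0   # time before which packets are held
    held: deque = field(default_factory=deque)

    def on_owd_sample(self, now_us: float, owd_us: float) -> None:
        """Extend the pause when a one-way-delay sample exceeds the threshold.

        Assumed rule: pause for the amount by which the delay exceeds the
        threshold. The abstract does not specify the actual rule.
        """
        if owd_us > self.threshold_us:
            candidate = now_us + (owd_us - self.threshold_us)
            self.paused_until_us = max(self.paused_until_us, candidate)

    def drain(self, now_us: float) -> list:
        """Release all held packets if the pause has expired, else none."""
        if now_us < self.paused_until_us:
            return []
        released = list(self.held)
        self.held.clear()
        return released

    def submit(self, now_us: float, packet) -> list:
        """Queue a packet from the transport and return whatever may be sent now."""
        self.held.append(packet)
        return self.drain(now_us)


gate = EdgeGate(threshold_us=50.0)
first = gate.submit(0.0, "p1")          # no overload seen yet: passes through
gate.on_owd_sample(10.0, 80.0)          # delay 30 us over threshold: hold until t=40
held_a = gate.submit(20.0, "p2")        # held at the edge
held_b = gate.submit(39.0, "p3")        # still held
released = gate.drain(40.0)             # pause expired: both packets released in order
```

Because the gate sits below the transport and only delays packets, the congestion control protocol above it (CUBIC, BBR, and so on) needs no changes in this sketch, which mirrors the separation of equilibrium and transient handling that the abstract describes.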

Original language: English (US)
Title of host publication: Proceedings of the 18th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2021
Publisher: USENIX Association
Pages: 47-61
Number of pages: 15
ISBN (Electronic): 9781939133212
State: Published - 2021
Event: 18th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2021 - Virtual, Online
Duration: Apr 12 2021 - Apr 14 2021

Publication series

Name: Proceedings of the 18th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2021

Conference

Conference: 18th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2021
City: Virtual, Online
Period: 4/12/21 - 4/14/21

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Control and Systems Engineering
