Draft:Datacenter Congestion Control

From Wikipedia, the free encyclopedia


Datacenter congestion control is the set of techniques and mechanisms used to manage network traffic within datacenters to prevent network congestion and keep transmission efficient. When multiple servers send data simultaneously through shared network infrastructure, congestion can occur. Congestion control algorithms determine how fast each sender should transmit data, when to slow down, and when it is safe to speed up again.

Datacenter congestion control algorithms operate in an environment that is fundamentally different from that of traditional internet congestion control, which is mostly handled by TCP (Transmission Control Protocol). TCP was designed for the internet, which has high latency and unpredictable conditions. Datacenter networks must deliver much lower latency (microseconds instead of milliseconds) and high bandwidth, and they can rely on more predictable network topologies. Because round-trip times within a datacenter can be as low as a few microseconds, congestion can also build up within microseconds, so datacenter congestion control mechanisms must react much faster and more precisely than their internet counterparts.
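The time scales involved can be illustrated with some simple arithmetic. The sketch below is illustrative only: the function names, the 100 Gbit/s link speed, the 1 MB buffer, and the nine-sender incast scenario are assumptions chosen for the example, not figures taken from the cited papers.

```python
def bytes_per_interval(link_gbps: float, interval_us: float) -> float:
    """Bytes a link carries in `interval_us` microseconds at `link_gbps`."""
    bits = link_gbps * 1e9 * interval_us * 1e-6
    return bits / 8


def time_to_fill_us(buffer_bytes: float, senders: int, link_gbps: float) -> float:
    """Microseconds until `senders` line-rate flows fill a switch buffer
    that drains through a single egress link of the same speed."""
    # Arrival rate minus drain rate (hypothetical incast model).
    excess_gbps = (senders - 1) * link_gbps
    if excess_gbps <= 0:
        return float("inf")  # the queue never grows
    return buffer_bytes * 8 / (excess_gbps * 1e9) * 1e6


if __name__ == "__main__":
    # Data sent during one 10-microsecond round trip at 100 Gbit/s.
    print(bytes_per_interval(100, 10))
    # Time for 9 senders to fill a 1 MB buffer behind one 100 Gbit/s port.
    print(time_to_fill_us(1_000_000, 9, 100))
```

Under these assumptions a sender puts roughly 125 kB on the wire before any feedback from a 10-microsecond round trip can arrive, and the buffer fills in about 10 microseconds, which is why reaction times must be of the same order.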

Methods for datacenter congestion control

Data Center TCP (DCTCP)

DCTCP[1] takes a fundamentally different approach from traditional TCP. Instead of treating congestion as a binary event, DCTCP derives multi-bit feedback about the extent of congestion. It leverages Explicit Congestion Notification (ECN), a feature that lets switches mark packets, rather than drop them, when their queues exceed a certain threshold. The sender tracks the fraction of packets marked with ECN and reduces its congestion window (and hence its transmission rate) in proportion to that fraction, so mild congestion triggers a small reduction and heavy congestion a large one.
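The proportional reaction can be sketched in a few lines. This is a simplified illustration rather than a full implementation: the sender keeps a running estimate `alpha` of the marked fraction and, once per window, scales the congestion window by `1 - alpha / 2`. The class name, the per-window callback, the floor of one packet, and the one-packet growth step are assumptions made for the example; the gain `g = 1/16` is a commonly quoted setting.

```python
class DctcpSender:
    """Simplified DCTCP-style window update (illustrative sketch)."""

    def __init__(self, cwnd: float = 10.0, g: float = 1 / 16):
        self.cwnd = cwnd    # congestion window, in packets
        self.g = g          # weight given to the newest sample
        self.alpha = 0.0    # running estimate of the marked fraction

    def on_window_acked(self, acked: int, marked: int) -> float:
        """Update state after one window of ACKs; return the new cwnd."""
        frac = marked / acked if acked else 0.0
        # Exponentially weighted moving average of the marked fraction.
        self.alpha = (1 - self.g) * self.alpha + self.g * frac
        if marked > 0:
            # Cut in proportion to the estimated extent of congestion.
            self.cwnd = max(1.0, self.cwnd * (1 - self.alpha / 2))
        else:
            # No marks: grow by one packet per window, as in TCP.
            self.cwnd += 1.0
        return self.cwnd


if __name__ == "__main__":
    sender = DctcpSender(cwnd=100.0)
    print(sender.on_window_acked(acked=100, marked=100))
    print(sender.on_window_acked(acked=100, marked=0))
```

When `alpha` is close to 1 the window is halved, as in standard TCP; when only a few packets are marked the reduction is correspondingly small.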

TIMELY

TIMELY[2] uses delay as the primary congestion signal. In datacenter networks, increases in round-trip time (RTT) correlate strongly with growing queue lengths at switches. TIMELY senders measure RTT at microsecond granularity and use a rate-based control algorithm that increases the sending rate when RTT is low and stable and decreases it when RTT is rising.
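A delay-based rate controller of this kind can be sketched as follows. The code is a simplified illustration, not the published algorithm in full: it reacts to the smoothed change in RTT (the gradient), uses low and high RTT thresholds as guard rails, and omits refinements such as faster increase after repeated good samples. The class name and all parameter values are assumptions chosen for the example.

```python
class TimelySender:
    """Simplified RTT-gradient rate control (illustrative sketch)."""

    def __init__(self, rate_mbps: float, min_rtt_us: float,
                 t_low_us: float = 50.0, t_high_us: float = 500.0,
                 delta_mbps: float = 10.0, beta: float = 0.8,
                 ewma: float = 0.875):
        self.rate = rate_mbps
        self.min_rtt_us = min_rtt_us   # used to normalize the gradient
        self.t_low_us = t_low_us       # below this RTT, always increase
        self.t_high_us = t_high_us     # above this RTT, always decrease
        self.delta = delta_mbps        # additive increase step
        self.beta = beta               # multiplicative decrease factor
        self.ewma = ewma               # weight of the newest RTT difference
        self.prev_rtt_us = None
        self.rtt_diff_us = 0.0

    def on_rtt_sample(self, rtt_us: float) -> float:
        """Process one RTT measurement; return the new sending rate."""
        if self.prev_rtt_us is None:
            # First sample: no difference to compute yet (assumption).
            self.prev_rtt_us = rtt_us
            return self.rate
        new_diff = rtt_us - self.prev_rtt_us
        self.prev_rtt_us = rtt_us
        self.rtt_diff_us = ((1 - self.ewma) * self.rtt_diff_us
                            + self.ewma * new_diff)
        gradient = self.rtt_diff_us / self.min_rtt_us
        if rtt_us < self.t_low_us:
            self.rate += self.delta
        elif rtt_us > self.t_high_us:
            self.rate *= 1 - self.beta * (1 - self.t_high_us / rtt_us)
        elif gradient <= 0:
            self.rate += self.delta    # RTT flat or falling: speed up
        else:
            # RTT rising: back off in proportion to the gradient.
            # The clamp at zero is a safeguard added for this sketch.
            self.rate *= max(0.0, 1 - self.beta * gradient)
        return self.rate


if __name__ == "__main__":
    sender = TimelySender(rate_mbps=1000.0, min_rtt_us=20.0)
    for rtt in (100.0, 100.0, 110.0, 30.0):
        print(rtt, sender.on_rtt_sample(rtt))
```

Reacting to the gradient rather than the absolute RTT lets the sender slow down while a queue is still growing, before the delay itself becomes large.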

DCQCN (Data Center Quantized Congestion Notification)

DCQCN[3] is designed for RDMA over Converged Ethernet (RoCE) networks. It controls transmission rates at the network interface card (NIC) level. Switches signal congestion by marking packets with Explicit Congestion Notification (ECN), and receivers that see marked packets send congestion notification packets back to the senders. A DCQCN sender reduces its rate immediately when congestion is reported, then increases it gradually, in a manner similar to TCP's additive increase.
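The sender-side behaviour can be sketched as a small state machine. This is a simplified illustration under stated assumptions: it models only the immediate cut on a congestion notification and a two-stage recovery (first halving the gap to the previous rate, then raising the target additively), and it leaves out the byte counters and the faster increase stage of the full scheme. The class name, method names, and parameter values are chosen for the example.

```python
class DcqcnSender:
    """Simplified DCQCN-style sender rate control (illustrative sketch)."""

    def __init__(self, line_rate_gbps: float, g: float = 1 / 256,
                 rate_ai_gbps: float = 0.04, fast_recovery_steps: int = 5):
        self.line_rate = line_rate_gbps
        self.current_rate = line_rate_gbps   # rate the NIC sends at now
        self.target_rate = line_rate_gbps    # rate to recover towards
        self.alpha = 1.0                     # congestion estimate
        self.g = g
        self.rate_ai = rate_ai_gbps          # additive increase step
        self.fast_recovery_steps = fast_recovery_steps
        self.steps_since_cut = 0

    def on_cnp(self) -> float:
        """A congestion notification arrived: cut the rate immediately."""
        self.target_rate = self.current_rate
        self.current_rate *= 1 - self.alpha / 2
        self.alpha = (1 - self.g) * self.alpha + self.g
        self.steps_since_cut = 0
        return self.current_rate

    def on_alpha_timer(self) -> None:
        """No notification in the last period: decay the estimate."""
        self.alpha *= 1 - self.g

    def on_increase_timer(self) -> float:
        """Periodic recovery step; return the new sending rate."""
        self.steps_since_cut += 1
        if self.steps_since_cut > self.fast_recovery_steps:
            # Past fast recovery: probe for bandwidth additively.
            self.target_rate = min(self.line_rate,
                                   self.target_rate + self.rate_ai)
        # Move halfway from the current rate to the target rate.
        self.current_rate = (self.target_rate + self.current_rate) / 2
        return self.current_rate


if __name__ == "__main__":
    nic = DcqcnSender(line_rate_gbps=100.0)
    print(nic.on_cnp())
    print(nic.on_increase_timer())
    print(nic.on_increase_timer())
```

Because the target rate remembers the rate in use just before the cut, a sender that was slowed by a transient burst returns to its previous rate within a few timer periods.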

ADPG (Reinforcement Learning for Datacenter Congestion Control)

ADPG[4] uses reinforcement learning (RL) rather than hand-designed rules for adjusting rates: an agent is trained to learn congestion control policies through experience. The RL agent uses packet loss, latency measurements, and response patterns to select an action (raising or lowering the sending rate) expected to lead to the best outcome in terms of throughput and latency. According to its authors, this learning-based approach can outperform fixed rules by discovering complex control policies that are hard for human designers to find.

References

  1. Alizadeh, Mohammad; Greenberg, Albert; Maltz, David A.; Padhye, Jitendra; Patel, Parveen; Prabhakar, Balaji; Sengupta, Sudipta; Sridharan, Murari (2010-08-30). "Data center TCP (DCTCP)". SIGCOMM Comput. Commun. Rev. 40 (4): 63–74. doi:10.1145/1851275.1851192. ISSN 0146-4833.
  2. Mittal, Radhika; Lam, Vinh The; Dukkipati, Nandita; Blem, Emily; Wassel, Hassan; Ghobadi, Monia; Vahdat, Amin; Wang, Yaogong; Wetherall, David; Zats, David (2015-08-17). "TIMELY: RTT-based Congestion Control for the Datacenter". SIGCOMM Comput. Commun. Rev. 45 (4): 537–550. doi:10.1145/2829988.2787510. ISSN 0146-4833.
  3. Zhu, Yibo; Eran, Haggai; Firestone, Daniel; Guo, Chuanxiong; Lipshteyn, Marina; Liron, Yehonatan; Padhye, Jitendra; Raindel, Shachar; Yahia, Mohamad Haj; Zhang, Ming (2015-08-17). "Congestion Control for Large-Scale RDMA Deployments". SIGCOMM Comput. Commun. Rev. 45 (4): 523–536. doi:10.1145/2829988.2787484. ISSN 0146-4833.
  4. Tessler, Chen; Shpigelman, Yuval; Dalal, Gal; Mandelbaum, Amit; Haritan Kazakov, Doron; Fuhrer, Benjamin; Chechik, Gal; Mannor, Shie (2022-01-20). "Reinforcement Learning for Datacenter Congestion Control". SIGMETRICS Perform. Eval. Rev. 49 (2): 43–46. doi:10.1145/3512798.3512815. ISSN 0163-5999.