HPCC++: Enhanced High Precision Congestion Control

Uploaded On 2024-02-03


draft-pan-tsvwg-hpccplus-00. Rui Miao, Hongqiang Harry Liu, Rong Pan, Jeongkeun Lee, Changhoon Kim, Barak Gafni, Yuval Shpigelman. IETF-108 tsvwg, July 2020.




Presentation Transcript

1. HPCC++: Enhanced High Precision Congestion Control. draft-pan-tsvwg-hpccplus-00. Rui Miao, Hongqiang Harry Liu, Rong Pan, Jeongkeun Lee, Changhoon Kim, Barak Gafni, Yuval Shpigelman. IETF-108 tsvwg, July 2020.

2. Cloud desires hyper-speed networking. Today, clouds have more types of compute and storage resources. High-performance storage: storage-compute separation is the norm; HDD → SSD → NVMe; higher throughput, lower latency; 1M IOPS / 50~100 us. High-performance computation: distributed deep learning, HPC; CPU → GPU, FPGA, ASIC; faster compute, lower latency, e.g. latency < 10 us. Resource disaggregation: needs ultra-low latency, 3-5 us, > 40 Gbps (Gao et al., OSDI'16). Bigger data to compute and store; faster compute and storage devices; more network load.

3. Hyper-speed network chips != hyper-speed networking. [Diagram: two hosts, each with an App and a Network Stack, connected by a Network Fabric.] Hardware offloading (e.g., RDMA): traditional software-based networking stacks cannot keep up with the speed. Congestion control (CC): since end hosts are aggressive, the network is more vulnerable to congestion and packet loss.

4. Realistic challenges in current CC in RDMA networks. Challenges in current CC: Challenge-1: slow convergence. Challenge-2: standing queue. Challenge-3: heuristics in CC. Operation challenge-1: PFC storm and deadlock; disabling PFC causes bad performance! Operation challenge-2: running multiple applications; QoS queues are scarce resources! Operation challenge-3: complex parameter tuning; DCQCN has at least 15 parameters to tune!

5. HPCC++: Enhanced High Precision Congestion Control (SIGCOMM'19). New commodity ASICs have in-band telemetry ability. Use in-band telemetry as precise feedback for congestion control. [Diagram: the Sender sends packets to the Receiver over Link-1 and Link-2; telemetry is added to the packets at each link; the Receiver returns notification packets/ACKs; the Sender adjusts its rate per ACK.]

6. HPCC solves the 3 problems, using INT as the precise feedback. Fast convergence: the sender knows the precise rate to adjust to, on every ACK. Near-zero queue: feedback does not rely on the queue. Few parameters: the feedback is precise, so there is no need for heuristics, which require many parameters.
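To make "the sender knows the precise rate to adjust to, on every ACK" concrete, here is a minimal Python sketch of an INT-driven window update. It is a simplified illustration, not the draft's specification: the record layout (`HopTelemetry`), the function names, and the default values for `eta` and `w_ai_bytes` are assumptions chosen for the example, and details such as smoothing of the utilization estimate and per-RTT reference-window updates are omitted.

```python
from dataclasses import dataclass


@dataclass
class HopTelemetry:
    """Hypothetical per-hop INT record echoed back to the sender in an ACK."""
    qlen_bytes: float     # egress queue length when the packet left the hop
    tx_bytes: float       # cumulative bytes transmitted on the egress port
    timestamp_s: float    # time at which the record was stamped
    bandwidth_bps: float  # capacity of the egress link


def link_utilization(prev, cur, base_rtt_s):
    """Estimate normalized load on one link from two consecutive INT samples."""
    dt = cur.timestamp_s - prev.timestamp_s
    tx_rate_bps = (cur.tx_bytes - prev.tx_bytes) * 8 / dt
    bdp_bytes = cur.bandwidth_bps * base_rtt_s / 8
    # Take the smaller of the two queue samples to damp transient spikes.
    queue_term = min(prev.qlen_bytes, cur.qlen_bytes) / bdp_bytes
    return queue_term + tx_rate_bps / cur.bandwidth_bps


def next_window(ref_window_bytes, prev_hops, cur_hops, base_rtt_s,
                eta=0.95, w_ai_bytes=80.0):
    """Scale the window so the most loaded link moves toward utilization eta."""
    u = max(link_utilization(p, c, base_rtt_s)
            for p, c in zip(prev_hops, cur_hops))
    # Multiplicative correction toward the target, plus a small additive step.
    return ref_window_bytes / (u / eta) + w_ai_bytes


# Example: one hop at 100 Gbps, 10 us base RTT, link at line rate with one
# bandwidth-delay product of data queued (normalized load of about 2.0).
prev = [HopTelemetry(125_000, 0, 0.0, 100e9)]
cur = [HopTelemetry(125_000, 125_000, 1e-5, 100e9)]
new_w = next_window(125_000, prev, cur, base_rtt_s=1e-5)
```

In this example the bottleneck carries roughly twice the load it can sustain, so the sender cuts its window to about half in a single step instead of probing over several round trips. The only tunables in the sketch are the target utilization and the additive step, which matches the "few parameters" point on the slide.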

7. HPCC++ achieves lower FCT and near-zero queue. In testbed, vs. DCQCN (hardware-based, widely used in industry): web search traffic at 50% load. Vs. other CC (unavailable in HW) in simulation: HPCC performs better. [Chart annotations: 95% lower; 55% lower.] HPCC++ P99 queue: 23 KB (7 us queue delay).
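The pairing of a 23 KB queue with roughly 7 us of queuing delay can be checked with a short calculation. The slide does not state the link speed; the sketch below assumes a 25 Gbps link and decimal kilobytes (1 KB = 1000 bytes), and the function name is hypothetical.

```python
def queue_delay_us(queue_bytes, link_gbps):
    """Time to drain a queue of the given size at the given link speed, in microseconds."""
    bits = queue_bytes * 8
    # 1 Gbps drains 1000 bits per microsecond.
    return bits / (link_gbps * 1e3)


# 23 KB queue on an assumed 25 Gbps link.
delay = queue_delay_us(23_000, 25)
```

Under these assumptions the drain time comes out a little above 7 us, in line with the figure quoted on the slide. At a different link speed the same queue would map to a proportionally different delay.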

8. Thank You