CMSC 34702 ML for End-to-End Adaptation - PowerPoint Presentation

Uploaded On 2023-08-31


Presentation Transcript

1. CMSC 34702: ML for End-to-End Adaptation (Congestion Control). Junchen Jiang, October 8, 2020

2. Congestion Control for High Bandwidth-Delay Product Networks; TCP ex Machina: Computer-Generated Congestion Control

3. Background (based on the lecture from https://www.youtube.com/watch?v=HxOA8HrVzPg)

4. Layered view of the Internet. Layers: E2E; Network (host naming, routing, etc.); Link. E2E functions: reliability, timer, sliding window, flow control, sharing (congestion control).

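5. What is congestion control? [Figure: topology with six 100 Mbps links and one 10 Mbps link; labels L_i and C.] (Ideal) objective: [formula not preserved in this transcript]

6. Why is it hard? [Figure: same topology, six 100 Mbps links and one 10 Mbps link; labels L_i and C.] (Ideal) objective: [formula not preserved in this transcript]. To scale out, the network must keep no per-flow state.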

7. Sharing a network using buffering

8. Buffering => congestion collapse. [Figure: throughput vs. sum of load, with capacity C marked.]

9. Goals of congestion control. Objective: avoid congestion collapse; reasonably high link utilization; fairness. Under: dynamic N (# of competing flows); wide range of C (link capacity); delayed feedback.

10. Congestion control framework. [Diagram: Sender and Receiver exchanging Packet and Ack.] State: estimation of RTT (round-trip time); congestion window (# of outstanding packets). Decision: when to send out the next packet?

11. Example: AIMD. [Diagram: Sender and Receiver exchanging Packet and Ack.] State: EWMA RTT: estimation of RTT (round-trip time); Cwnd: congestion window (# of outstanding packets). Decision: when to send out the next packet? AIMD logic: on Ack: Cwnd += 1; update ewma_rtt. On packet loss/timeout: Cwnd /= 2; update ewma_rtt.
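
The AIMD logic on this slide can be sketched in a few lines of Python. This is a minimal illustration, not a real TCP implementation: the class name `AimdSender`, the EWMA weight of 0.125, and the one-packet floor are assumptions added here. The slide's `Cwnd += 1` is read as one packet per round trip, which the sketch approximates by adding 1/cwnd per ack; that reading is also an assumption. The slide updates ewma_rtt on loss as well; the sketch leaves the estimate unchanged there because no RTT sample arrives with a loss.

```python
from dataclasses import dataclass


@dataclass
class AimdSender:
    """Toy AIMD sender state (hypothetical names, for illustration only)."""
    cwnd: float = 1.0       # congestion window, in packets
    ewma_rtt: float = 0.0   # smoothed RTT in seconds; 0.0 means "no sample yet"
    gain: float = 0.125     # EWMA weight for new samples (assumed value)

    def _update_rtt(self, sample: float) -> None:
        # The first sample initializes the estimate; later samples are blended in.
        if self.ewma_rtt == 0.0:
            self.ewma_rtt = sample
        else:
            self.ewma_rtt = (1 - self.gain) * self.ewma_rtt + self.gain * sample

    def on_ack(self, rtt_sample: float) -> None:
        # Additive increase: 1/cwnd per ack is roughly +1 packet per RTT.
        self.cwnd += 1.0 / self.cwnd
        self._update_rtt(rtt_sample)

    def on_loss(self) -> None:
        # Multiplicative decrease, never dropping below one packet.
        self.cwnd = max(1.0, self.cwnd / 2.0)
```

With a window of 10 packets, one ack moves the window to 10.1 and a subsequent loss halves it to 5.05.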

12. Congestion Control for High Bandwidth-Delay Product Networks (SIGCOMM’02). First systematic approach to congestion control for high bandwidth-delay product networks, which were expected to dominate in the following decade. (Based on slides from R. Stallings, M. Handley and D. Katabi)

13. TCP congestion control performs poorly as bandwidth or delay increases. [Figures: avg. TCP utilization vs. bottleneck bandwidth (Mb/s), with 50 flows in both directions, buffer = BW x delay, RTT = 80 ms; avg. TCP utilization vs. round-trip delay (sec), with 50 flows in both directions, buffer = BW x delay, BW = 155 Mb/s.] Shown analytically in [Low01] and via simulations. Because TCP lacks fast response: spare bandwidth is available => TCP increases by 1 pkt/RTT even if spare bandwidth is huge. When a TCP starts, it increases exponentially => too many drops => flows ramp up by 1 pkt/RTT, taking forever to grab the large bandwidth.
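
A rough back-of-the-envelope calculation shows why +1 pkt/RTT is slow at this scale. The sketch below uses the slide's BW = 155 Mb/s and RTT = 80 ms; the 1500-byte packet size and the choice to start from half the bandwidth-delay product (as after a single window halving) are assumptions made only for illustration.

```python
def bdp_packets(bandwidth_bps: float, rtt_s: float, packet_bytes: int = 1500) -> float:
    """Bandwidth-delay product expressed in packets."""
    return bandwidth_bps * rtt_s / (8 * packet_bytes)


def additive_ramp_seconds(bandwidth_bps: float, rtt_s: float,
                          start_fraction: float = 0.5,
                          packet_bytes: int = 1500) -> float:
    """Time to grow cwnd from start_fraction * BDP to BDP at +1 packet per RTT."""
    bdp = bdp_packets(bandwidth_bps, rtt_s, packet_bytes)
    rtts_needed = bdp * (1 - start_fraction)   # one extra packet per round trip
    return rtts_needed * rtt_s


bdp = bdp_packets(155e6, 0.080)              # roughly 1033 packets
ramp = additive_ramp_seconds(155e6, 0.080)   # roughly 41 seconds
```

Under these assumptions a single flow needs on the order of 500 round trips, about 41 seconds, to regain the full window after one halving.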

14. Solution: Decouple Congestion Control from Fairness. Congestion control: high utilization; small queues; few drops. Fairness: bandwidth allocation policy. Key reason: coupled congestion control and fairness; a single mechanism controls both. Example: in TCP, Additive-Increase Multiplicative-Decrease (AIMD) controls both.

15. Solution: Decouple Congestion Control from Fairness. Key reason: coupled congestion control and fairness; a single mechanism controls both. Example: in TCP, Additive-Increase Multiplicative-Decrease (AIMD) controls both. How does decoupling solve the problem? To control congestion: use MIMD, which shows fast response. To control fairness: use AIMD, which converges to fairness.

16. Nice properties of XCP. Improved congestion control: small queues; almost no drops. Improved fairness. Scalable (no per-flow state).

17. XCP: An eXplicit Control Protocol. Congestion Controller; Fairness Controller.

18. How does XCP Work? [Diagram: Congestion Header with fields Round Trip Time, Congestion Window, and Feedback.] Feedback = +0.1 packet

19. How does XCP Work? [Diagram: Congestion Header with fields Round Trip Time, Congestion Window, and Feedback.] Feedback = +0.1 packet; Feedback = -0.3 packet

20. How does XCP Work? Congestion Window = Congestion Window + Feedback. Routers compute feedback without any per-flow state. XCP extends ECN and CSFQ.

21. How Does an XCP Router Compute the Feedback? Congestion Controller (MIMD). Goal: matches input traffic to link capacity & drains the queue. Looks at aggregate traffic & queue. Algorithm: aggregate traffic changes by $\Delta$, with $\Delta \sim \text{Spare Bandwidth}$ and $\Delta \sim -\text{Queue Size}$; so $\Delta = \alpha \, d_{avg} \, \text{Spare} - \beta \, \text{Queue}$. Fairness Controller (AIMD). Goal: divides $\Delta$ between flows to converge to fairness. Looks at a flow’s state in the Congestion Header. Algorithm: if $\Delta > 0$ => divide $\Delta$ equally between flows; if $\Delta < 0$ => divide $\Delta$ between flows proportionally to their current rates.
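
The two controllers on this slide can be sketched as two small functions. This is a simplified illustration rather than the XCP router algorithm itself: it takes an explicit list of flow rates for readability, whereas XCP does the split per packet without per-flow state, and the function names are hypothetical. The defaults 0.4 and 0.226 are the parameter values reported in the XCP paper; the units (bytes and seconds) are an assumption of this sketch.

```python
from typing import List

ALPHA = 0.4    # value reported in the XCP paper
BETA = 0.226   # value reported in the XCP paper


def aggregate_feedback(avg_rtt_s: float, capacity_Bps: float,
                       input_rate_Bps: float, queue_bytes: float,
                       alpha: float = ALPHA, beta: float = BETA) -> float:
    """Congestion controller: desired change in aggregate traffic, in bytes."""
    spare = capacity_Bps - input_rate_Bps   # negative when the link is overloaded
    return alpha * avg_rtt_s * spare - beta * queue_bytes


def split_feedback(delta: float, flow_rates: List[float]) -> List[float]:
    """Fairness controller: equal shares when increasing, proportional when decreasing."""
    if not flow_rates:
        return []
    if delta >= 0:
        share = delta / len(flow_rates)
        return [share] * len(flow_rates)
    total = sum(flow_rates)
    if total == 0:
        return [0.0] * len(flow_rates)
    return [delta * rate / total for rate in flow_rates]
```

Splitting a positive change equally while splitting a negative change in proportion to rate is what gives the fairness controller its additive-increase, multiplicative-decrease character.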

22. Getting the devil out of the details … Congestion Controller: $\Delta = \alpha \, d_{avg} \, \text{Spare} - \beta \, \text{Queue}$. Theorem: the system converges to optimal utilization (i.e., is stable) for any link bandwidth, delay, and number of sources if: [condition not preserved in this transcript] (proof based on the Nyquist criterion). No parameter tuning. Fairness Controller: Algorithm: if $\Delta > 0$ => divide $\Delta$ equally between flows; if $\Delta < 0$ => divide $\Delta$ between flows proportionally to their current rates. Need to estimate the number of flows N, where RTT_pkt: round-trip time in header; Cwnd_pkt: congestion window in header; T: counting interval. No per-flow state.
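
Two details of this slide can be illustrated with a short sketch, under stated assumptions. First, the transcript lists RTT_pkt, Cwnd_pkt, and T but not the estimator for N; one reading consistent with those symbols is to sum RTT_pkt / Cwnd_pkt over the packets seen in an interval T and divide by T, since a flow sending Cwnd/RTT packets per second then contributes about 1 to the total. Second, the XCP paper states its stability condition as 0 < alpha < pi / (4 * sqrt(2)) with beta = alpha^2 * sqrt(2); the check below encodes that. Function names and the tolerance are hypothetical.

```python
import math
from typing import Iterable, Tuple


def estimate_num_flows(headers: Iterable[Tuple[float, float]], interval_s: float) -> float:
    """Estimate N from (rtt_s, cwnd_pkts) header pairs seen during one interval T.

    Assumed estimator: N ~= (1 / T) * sum(RTT_pkt / Cwnd_pkt) over packets.
    No per-flow table is needed, only a running sum.
    """
    return sum(rtt / cwnd for rtt, cwnd in headers) / interval_s


def satisfies_stability_condition(alpha: float, beta: float, rel_tol: float = 0.01) -> bool:
    """Check 0 < alpha < pi/(4*sqrt(2)) and beta ~= alpha**2 * sqrt(2)."""
    alpha_ok = 0 < alpha < math.pi / (4 * math.sqrt(2))
    beta_ok = math.isclose(beta, alpha ** 2 * math.sqrt(2), rel_tol=rel_tol)
    return alpha_ok and beta_ok
```

For example, one flow with RTT 0.1 s and a window of 10 sends about 100 packets per second, and a second flow with RTT 0.2 s and a window of 4 sends about 20; feeding those 120 headers for a one-second interval gives an estimate of about 2.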

23. XCP Remains Efficient as Bandwidth or Delay Increases. [Figures: avg. utilization as a function of bottleneck bandwidth (Mb/s); avg. utilization as a function of round-trip delay (sec).] XCP increases proportionally to spare bandwidth. $\alpha$ and $\beta$ chosen to make XCP robust to delay.

24. Traditional view of congestion control. What feedback from the network (routers)? Implicit: ack, timeout, duplicated acks, RTT inflation, … (e.g., Linux Cubic). Explicit: router-generated marks (e.g., XCP, ECN). How should end hosts react to feedback? AIMD: additive increase, multiplicative decrease (Reno). Growing cwnd as a cubic function of time (Linux Cubic). Explicitly modeling the network capacity (TFRC). Relying entirely on router-generated marks (e.g., XCP).

25. Heavily depends on operational environments. 1980s-1990s: local-area networks: Reno, Tahoe, NewReno. 1990s-2000s: diverse environments; high bandwidth-delay product (wireless, satellite, high-capacity WAN): XCP, Vegas, Cubic, Compound TCP, … 2010s: datacenter; microsecond RTT, 10s of Gbps bandwidth, delay-sensitive apps, hardware support: DCTCP, Deadline-driven TCP, …

26. The march of congestion control mechanisms. Why so many designs? (Based on slides from Keith Winstein: http://conferences.sigcomm.org/sigcomm/2013/slides/sigcomm/12.pdf)

27. Goals of congestion control. Objective: avoid congestion collapse; reasonably high link utilization; fairness. Under: dynamic N; wide range of C; delayed feedback. Problem formulations are too vague!

28. Rational choice of scheme is challenging. Different goals? Different assumptions about network? One scheme just plain better?

29. Networks constrained by a fuzzy idea of TCP’s assumptions: mask stochastic loss; bufferbloat; mask out-of-order delivery; no parallel/multipath routing. Advice for Internet Subnetwork Designers (RFC 3819) is 21,000 words! (Based on slides from Keith Winstein: http://conferences.sigcomm.org/sigcomm/2013/slides/sigcomm/12.pdf)

30. Design rationale? (from XCP paper) "Congestion is not a binary variable, so congestion signaling should reflect the degree of congestion. … This [XCP] allows the senders to decrease their sending windows quickly when the bottleneck is highly congested, … . The resulting protocol is both more responsive and less oscillatory." "A fundamental characteristic of such a system is that it becomes unstable for some large feedback delay. … In the context of congestion control, this means that as delay increases, the sources should change their sending rates more slowly." Fuzzy and qualitative (handwavy) design constraints.

31. TCP ex Machina: Computer-Generated Congestion Control (SIGCOMM’13). First paper to demonstrate that machine-generated logic can defeat handcrafted congestion control (in some cases). (Based on slides from Keith Winstein: http://conferences.sigcomm.org/sigcomm/2013/slides/sigcomm/12.pdf)

32. If congestion control is the answer, what’s the question?

33. If congestion control is the answer, what’s the question? Are there better answers?

34. Free the network to evolve. Transport layer should adapt to whatever: network does; application wants.

35. A more precise formulation of the congestion control problem: control knobs; objectives; environment (network model & traffic model).

36. The knobs of congestion control. * Superrational congestion control: assuming every node is running the same algorithm

37. Objectives of congestion control

38. Assumptions about the environment

39. Remy: Computer-generated congestion control. Inputs to Remy: control knobs; performance objectives; network model; traffic model. Output: RemyCC (Remy-generated congestion control).

40. RemyCC workflow. [Diagram labels: feedback (acks, timeout, …); congestion signals; congestion control logic; MIMD parameters; when to send the next packet.]

41. RemyCC workflow. [Diagram labels: feedback (acks, timeout, …); congestion signals; MIMD parameters; when to send the next packet.]

42. A RemyCC tracks three congestion signals (State)

43-45. A RemyCC maps each state to an action. [Diagram labels: feedback (acks, timeout, …); congestion signals; MIMD parameters; when to send the next packet.]
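
One way to picture the state-to-action mapping is a lookup table over regions of the signal space. The sketch below is hypothetical: the `Rule`, `Action`, `lookup`, and `apply_action` names are invented here, the state is treated as an unnamed 3-tuple because the transcript does not preserve the three signals, and the action fields follow the three components named later in the deck (multiple to congestion window, increment to congestion window, min interval between 2 outgoing packets).

```python
from dataclasses import dataclass
from typing import List, Tuple

State = Tuple[float, float, float]   # three congestion signals (order is an assumption)


@dataclass
class Action:
    window_multiple: float    # multiple applied to the congestion window
    window_increment: float   # increment added to the congestion window
    min_interval_ms: float    # minimum gap between two outgoing packets


@dataclass
class Rule:
    lower: State   # inclusive lower corner of the region
    upper: State   # exclusive upper corner of the region
    action: Action

    def contains(self, state: State) -> bool:
        return all(lo <= x < hi for lo, x, hi in zip(self.lower, state, self.upper))


def lookup(rules: List[Rule], state: State) -> Action:
    """Return the action of the first rule whose region contains the state."""
    for rule in rules:
        if rule.contains(state):
            return rule.action
    raise ValueError("state is outside every rule's region")


def apply_action(cwnd: float, action: Action) -> float:
    """New window = multiple * cwnd + increment, floored at zero."""
    return max(0.0, action.window_multiple * cwnd + action.window_increment)
```

A sender would call `lookup` on each ack with its current signals, update its window with `apply_action`, and space its next transmission by at least `min_interval_ms`.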

46. RemyCC workflow. [Diagram labels: feedback (acks, timeout, …); congestion signals; MIMD parameters; when to send the next packet.]

47. Remy’s Job: find piecewise-continuous Rule() that optimizes expected value of objective function.

48. Example run of Remy

49. 2-D Search Space. Objective: Maximizing

50. One action for all states. Find the best value. [Action components shown on slides 50-60: multiple to congestion window; increment to congestion window; min interval between 2 outgoing packets.]

51. The best (single) action. Now split it on median.

52. Simulate

53. Now split the most-used rule

54. Simulate

55. Optimize

56. Split

57. Simulate

58. Optimize

59. Split and so forth…

60. Eventually we get this…
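
The split step in this walk-through can be sketched for the 2-D case. This is an illustrative reading of "split it on median" and "split the most-used rule", not Remy's actual code: the box representation, the function names, and splitting both dimensions at once are assumptions.

```python
from statistics import median
from typing import Dict, Hashable, List, Tuple

Interval = Tuple[float, float]
Box = Tuple[Interval, Interval]   # ((x_lo, x_hi), (y_lo, y_hi))


def most_used(usage: Dict[Hashable, int]) -> Hashable:
    """Pick the rule that was consulted most often during simulation."""
    return max(usage, key=usage.get)


def split_on_median(box: Box, visits: List[Tuple[float, float]]) -> List[Box]:
    """Split a 2-D box into four at the per-dimension median of visited states.

    `visits` holds the (x, y) states that fell inside `box` during simulation.
    Each child box would start with a copy of the parent's action and then be
    re-optimized (the "Optimize" step in the deck).
    """
    (x_lo, x_hi), (y_lo, y_hi) = box
    mx = median(x for x, _ in visits)
    my = median(y for _, y in visits)
    return [
        ((x_lo, mx), (y_lo, my)),
        ((mx, x_hi), (y_lo, my)),
        ((x_lo, mx), (my, y_hi)),
        ((mx, x_hi), (my, y_hi)),
    ]
```

Splitting at the median of observed states, rather than at the geometric center, places the finer-grained rules where the simulated senders actually spend their time.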

61. Scenario 1: details

62-73. Scenario 1: throughput-delay plot, built up over twelve slides. [Plot: throughput (Mbps), ticks 0.4 to 1.8, against queueing delay (ms), ticks 1, 2, 4, 8, 16, 32; an arrow marks the "Better" direction.] Slides 63-66 and 68-69 show NewReno; slide 67 labels the median outcome, throughput unfairness, and delay unfairness; slide 70 adds Vegas, Cubic, and Compound; slide 71 adds XCP and Cubic/sfqCoDel; slides 72-73 add RemyCC with δ=0.1, δ=1, and δ=10.

74. Scenario 2: Verizon LTE, n = 8. [Plot: throughput (Mbps), ticks 0.8 to 1.8, against queueing delay (ms), ticks 16, 32, 64; schemes shown: Vegas, Remy δ=0.1, Remy δ=1, Remy δ=10, Cubic, Compound, NewReno, XCP, Cubic/sfqCoDel.]

75-77. The effect of prior knowledge. [Plot: log(normalized throughput) - log(delay), ticks -1 to -6, against link speed (megabits/sec), ticks 4.74, 15, 47.4.] Slide 75 shows RemyCC 10x; slide 76 adds RemyCC exact; slide 77 adds Cubic-over-sfqCoDel.
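
The metric named on these plots, log(normalized throughput) - log(delay), can be written as a small function. The sketch below adds a weight `delta` on the delay term, matching the δ labels on the RemyCC variants in the earlier plots; treating δ as that weight follows the Remy paper's objective as I understand it and should be read as an assumption here. The function names are hypothetical, and inputs are assumed to be positive and already normalized.

```python
import math
from typing import Iterable, Tuple


def objective(throughput: float, delay: float, delta: float = 1.0) -> float:
    """log(throughput) - delta * log(delay); higher is better."""
    return math.log(throughput) - delta * math.log(delay)


def mean_objective(flows: Iterable[Tuple[float, float]], delta: float = 1.0) -> float:
    """Average the per-flow objective over (throughput, delay) pairs."""
    scores = [objective(t, d, delta) for t, d in flows]
    return sum(scores) / len(scores)
```

A larger `delta` penalizes delay more heavily, so a scheme tuned with δ=10 would be expected to trade throughput for shorter queues, while δ=0.1 leans the other way.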

78. Critiques? "End to end" is somehow contradictory to the objective in the Introduction section … Traffic model depends on the applications, which can also change frequently and are hard to detect. Hard to implement the algorithm in real-world systems. The core algorithm of Remy needs a huge amount of time to run (CPU weeks). Automatic parameter tuning where the space of tuning is still limited to several given variables under some strong assumptions/constraints. It would be interesting to see how well RemyCC could adapt to changing network conditions over time, and if it could be made to incrementally improve. I would like to see a visualization (possibly in 2 dimensions at a time) of what the rulebook actually looks like. I think it would be possible to learn the rulebook through a gradient boosting machine instead of their genetic algorithm. Their approach, imo, could be improved if it was as naive as they described. They could have done something with Bayesian analysis.

79. Takeaways (not just transport protocols). An important problem. Pervasively used => hard to define goals precisely. Traditionally relying on expert-crafted heuristics/assumptions/philosophies. Sounds familiar? Cache policy, routing, database management, deep learning architecture search, catching bugs, …

80. Another subtle takeaway. This is not exactly ML: you can’t find the word ML/machine-learning/training/etc. in the paper! Focus on what’s missing in the literature instead. But this did pave the road for more blackbox/ML solutions: An Experimental Study of the Learnability of Congestion Control (SIGCOMM’14); PCC Vivace: Online-Learning Congestion Control (NSDI’18).
