CMSC 34702 ML for End-to-End

Uploaded On 2023-09-08

Presentation Transcript

1. CMSC 34702: ML for End-to-End Adaptation (Congestion Control). Junchen Jiang, October 17, 2019

2. "Congestion Control for High Bandwidth-Delay Product Networks"; "TCP ex Machina: Computer-Generated Congestion Control"

3. Background (based on the lecture from https://www.youtube.com/watch?v=HxOA8HrVzPg)

4. Layered view of the Internet
    Layers: E2E; Network (host naming, routing, etc.); Link
    E2E functions: reliability, timer, sliding window, flow control, sharing (congestion control)

5. What is congestion control?
    [Figure: topology with six 100 Mbps links and one 10 Mbps link; labels L_i and C]
    (Ideal) objective:

6. Why is it hard?
    [Figure: topology with six 100 Mbps links and one 10 Mbps link; labels L_i and C]
    (Ideal) objective:
    To scale out, the network must keep no per-flow state.

7. Sharing a network using buffering

8. Buffering → congestion collapse
    [Figure: throughput vs. sum of load, with capacity C marked]

9. Goals of congestion control
    Objective: avoid congestion collapse; reasonably high link utilization; fairness
    Under: dynamic N (# of competing flows); wide range of C (link capacity); delayed feedback

10. Congestion control framework
    [Figure: sender and receiver exchanging packets and ACKs]
    State: estimation of RTT (round-trip time); congestion window (# of outstanding packets)
    Decision: when to send out the next packet?

11. Example: AIMD
    [Figure: sender and receiver exchanging packets and ACKs]
    State:
        EWMA RTT: estimation of RTT (round-trip time)
        Cwnd: congestion window (# of outstanding packets)
    Decision: when to send out the next packet?
    AIMD logic:
        On Ack: Cwnd += 1; update ewma_rtt
        On packet loss/timeout: Cwnd /= 2; update ewma_rtt
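The AIMD logic on this slide can be sketched as a small sender-state class. This is a minimal illustration, not any real TCP stack: the class name, the field names, and the EWMA gain of 0.125 are assumptions. It also assumes the slide's "Cwnd += 1" means one packet per round trip, which per ACK works out to 1/cwnd.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AimdSender:
    """Toy AIMD sender. Field names and the 0.125 gain are illustrative assumptions."""

    cwnd: float = 1.0      # congestion window, in packets
    ewma_rtt: float = 0.0  # smoothed RTT estimate, in seconds (0.0 = no sample yet)
    gain: float = 0.125    # weight of each new RTT sample in the EWMA (assumed)

    def _update_rtt(self, sample: Optional[float]) -> None:
        if sample is None:
            return
        if self.ewma_rtt == 0.0:
            self.ewma_rtt = sample  # first sample seeds the estimate
        else:
            self.ewma_rtt = (1.0 - self.gain) * self.ewma_rtt + self.gain * sample

    def on_ack(self, rtt_sample: float) -> None:
        # Additive increase: 1/cwnd per ACK adds up to about one packet per RTT.
        self.cwnd += 1.0 / self.cwnd
        self._update_rtt(rtt_sample)

    def on_loss(self, rtt_sample: Optional[float] = None) -> None:
        # Multiplicative decrease: halve the window, but never go below one packet.
        self.cwnd = max(1.0, self.cwnd / 2.0)
        self._update_rtt(rtt_sample)

    def can_send(self, packets_in_flight: int) -> bool:
        # The "when to send" decision: only while the window has room.
        return packets_in_flight < self.cwnd
```

Starting from `AimdSender(cwnd=10.0)`, ten ACKs grow the window by roughly one packet, and a single `on_loss()` halves it.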

12. Congestion Control for High Bandwidth-Delay Product Networks (SIGCOMM'02)
    First systematic approach to congestion control under high bandwidth-delay product networks, which were to dominate in the next decade.
    (Based on slides from R. Stallings, M. Handley and D. Katabi)

13. TCP congestion control performs poorly as bandwidth or delay increases
    Shown analytically in [Low01] and via simulations.
    [Figure: avg. TCP utilization vs. bottleneck bandwidth (Mb/s); 50 flows in both directions, buffer = BW x delay, RTT = 80 ms]
    [Figure: avg. TCP utilization vs. round-trip delay (sec); 50 flows in both directions, buffer = BW x delay, BW = 155 Mb/s]
    Because TCP lacks fast response:
        Spare bandwidth is available → TCP increases by 1 pkt/RTT even if spare bandwidth is huge.
        When a TCP starts, it increases exponentially → too many drops → flows ramp up by 1 pkt/RTT, taking forever to grab the large bandwidth.

14. Solution: Decouple congestion control from fairness
    Congestion control: high utilization; small queues; few drops
    Fairness: bandwidth allocation policy
    Key reason: coupled congestion control and fairness. A single mechanism controls both.
    Example: in TCP, Additive-Increase Multiplicative-Decrease (AIMD) controls both.

15. Solution: Decouple congestion control from fairness
    Key reason: coupled congestion control and fairness. A single mechanism controls both.
    Example: in TCP, Additive-Increase Multiplicative-Decrease (AIMD) controls both.
    How does decoupling solve the problem?
        To control congestion: use MIMD, which shows fast response.
        To control fairness: use AIMD, which converges to fairness.
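A toy simulation can illustrate why AIMD is used for fairness and MIMD is not. The sketch below assumes two flows that see the same congestion signal at the same time and share one link; the starting rates, the capacity of 50, and the 1.1 multiplicative increase factor are made-up values for illustration only.

```python
def simulate(increase, decrease, x1=1.0, x2=40.0, capacity=50.0, steps=500):
    """Two flows with synchronized feedback: back off when the link is overloaded."""
    for _ in range(steps):
        if x1 + x2 > capacity:
            x1, x2 = decrease(x1), decrease(x2)
        else:
            x1, x2 = increase(x1), increase(x2)
    return x1, x2


# AIMD: add a constant on increase, halve on decrease.
aimd = simulate(lambda x: x + 1.0, lambda x: x / 2.0)

# MIMD: multiply on increase, halve on decrease.
mimd = simulate(lambda x: x * 1.1, lambda x: x / 2.0)

aimd_gap = abs(aimd[0] - aimd[1])   # shrinks toward zero
mimd_ratio = mimd[1] / mimd[0]      # stays at the initial 40:1
```

Under AIMD the gap between the two rates is unchanged by each additive step and halved by each decrease, so it shrinks toward zero. Under MIMD both steps scale the two rates by the same factor, so the initial 40:1 ratio never changes, which is why MIMD alone reacts quickly but cannot deliver fairness.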

16. Nice properties of XCP
    Improved congestion control: small queues; almost no drops
    Improved fairness
    Scalable (no per-flow state)

17. XCP: An eXplicit Control Protocol
    Congestion Controller; Fairness Controller

18. How does XCP work?
    Congestion header fields: Round Trip Time; Congestion Window; Feedback
    Feedback = +0.1 packet

19. How does XCP work?
    Congestion header fields: Round Trip Time; Congestion Window; Feedback
    Feedback = +0.1 packet; Feedback = -0.3 packet

20. How does XCP work?
    Congestion Window = Congestion Window + Feedback
    Routers compute feedback without any per-flow state.
    XCP extends ECN and CSFQ.

21. How does an XCP router compute the feedback?
    Congestion Controller (MIMD)
        Goal: match input traffic to link capacity and drain the queue.
        Looks at aggregate traffic and queue.
        Algorithm: aggregate traffic changes by φ, where φ ~ spare bandwidth and φ ~ −queue size. So, φ = α · d_avg · Spare − β · Queue.
    Fairness Controller (AIMD)
        Goal: divide φ between flows to converge to fairness.
        Looks at a flow's state in the congestion header.
        Algorithm: if φ > 0, divide φ equally between flows; if φ < 0, divide φ between flows proportionally to their current rates.
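The two controllers on this slide can be sketched in a few lines. This is a simplified illustration under stated assumptions: the function names are hypothetical, the defaults α = 0.4 and β = 0.226 are the constants commonly quoted from the XCP paper, and `split_feedback` takes an explicit list of per-flow rates purely for readability, whereas a real XCP router apportions feedback per packet without per-flow state.

```python
ALPHA = 0.4    # assumed default, as quoted from the XCP paper
BETA = 0.226   # assumed default, as quoted from the XCP paper


def aggregate_feedback(avg_rtt_s, capacity_Bps, input_rate_Bps, queue_bytes,
                       alpha=ALPHA, beta=BETA):
    """Congestion controller: phi = alpha * d_avg * Spare - beta * Queue (bytes)."""
    spare = capacity_Bps - input_rate_Bps  # spare bandwidth, bytes per second
    return alpha * avg_rtt_s * spare - beta * queue_bytes


def split_feedback(phi, flow_rates):
    """Fairness controller, simplified: equal shares up, proportional shares down."""
    n = len(flow_rates)
    if phi >= 0:
        return [phi / n] * n               # additive increase: same amount per flow
    total = sum(flow_rates)
    if total == 0:
        return [0.0] * n
    return [phi * rate / total for rate in flow_rates]  # multiplicative decrease


phi = aggregate_feedback(avg_rtt_s=0.1, capacity_Bps=1_000_000,
                         input_rate_Bps=600_000, queue_bytes=10_000)
shares_up = split_feedback(300.0, [100.0, 200.0, 300.0])
shares_down = split_feedback(-600.0, [100.0, 200.0, 300.0])
```

With 400,000 B/s of spare bandwidth and a 10,000-byte queue, φ comes out positive (about 13,740 bytes), so the link asks senders to speed up. A negative φ takes more from faster flows, which is what pulls the rates toward each other.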

22. Getting the devil out of the details …
    Congestion Controller
        φ = α · d_avg · Spare − β · Queue
        Theorem: System converges to optimal utilization (i.e., stable) for any link bandwidth, delay, number of sources if: [condition on α and β not preserved] (Proof based on Nyquist Criterion)
        No parameter tuning
    Fairness Controller
        Algorithm: if φ > 0, divide φ equally between flows; if φ < 0, divide φ between flows proportionally to their current rates.
        Need to estimate number of flows N [formula not preserved], where RTT_pkt is the round-trip time in the header, Cwnd_pkt is the congestion window in the header, and T is the counting interval.
        No per-flow state
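Both formulas on this slide were images and are missing from the transcript, so the sketch below reconstructs them from the XCP paper and should be read as an assumption rather than as the slide's exact content. The stability condition is taken to be 0 < α < π/(4√2) with β = α²√2, and the flow count is taken to be N ≈ Σ RTT_pkt / (T · Cwnd_pkt) over the packets seen in interval T. The helper `packets_seen` is a hypothetical traffic generator used only to exercise the estimate.

```python
import math


def satisfies_stability_condition(alpha, beta, tol=1e-3):
    """Check 0 < alpha < pi / (4 * sqrt(2)) and beta ~= alpha**2 * sqrt(2)."""
    in_range = 0.0 < alpha < math.pi / (4.0 * math.sqrt(2.0))
    return in_range and abs(beta - alpha ** 2 * math.sqrt(2.0)) <= tol


def estimate_num_flows(headers, interval_s):
    """Estimate N from (rtt_s, cwnd_pkts) pairs read off packets seen in one interval.

    A flow sends about cwnd / rtt packets per second, so each of its packets
    contributes rtt / (T * cwnd) and the flow as a whole contributes about 1.
    """
    return sum(rtt / (interval_s * cwnd) for rtt, cwnd in headers)


def packets_seen(flows, interval_s):
    """Hypothetical traffic: every flow sends cwnd / rtt packets per second."""
    headers = []
    for rtt, cwnd in flows:
        headers.extend([(rtt, cwnd)] * round(interval_s * cwnd / rtt))
    return headers


flows = [(0.1, 10), (0.2, 4), (0.05, 50)]   # (RTT in seconds, cwnd in packets)
estimate = estimate_num_flows(packets_seen(flows, 1.0), 1.0)
```

The router only sums a per-packet term, so it never needs a table of flows, which is the "no per-flow state" property the slide highlights.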

23. XCP remains efficient as bandwidth or delay increases
    [Figure: utilization as a function of bandwidth; avg. utilization vs. bottleneck bandwidth (Mb/s)] XCP increases proportionally to spare bandwidth.
    [Figure: utilization as a function of delay; avg. utilization vs. round-trip delay (sec)] α and β chosen to make XCP robust to delay.

24. Traditional view of congestion control
    What feedback from the network (routers)?
        Implicit: ACKs, timeouts, duplicate ACKs, RTT inflation, … (e.g., Linux Cubic)
        Explicit: router-generated marks (e.g., XCP, ECN)
    How should end hosts react to feedback?
        AIMD: additive increase, multiplicative decrease (Reno)
        Growing cwnd as a cubic function of time (Linux Cubic)
        Explicitly modeling the network capacity (TFRC)
        Relying entirely on router-generated marks (e.g., XCP)

25. Heavily depends on operational environments
    1980s-1990s: local-area networks. Reno, Tahoe, NewReno
    1990s-2000s: diverse environments, high bandwidth-delay product (wireless, satellite, high-capacity WAN). XCP, Vegas, Cubic, Compound TCP, …
    2010s: datacenter. Microsecond RTT, tens of Gbps bandwidth, delay-sensitive apps, hardware support. DCTCP, Deadline-driven TCP, …

26. The march of congestion control mechanisms
    Why so many designs?
    Based on slides from Keith Winstein: http://conferences.sigcomm.org/sigcomm/2013/slides/sigcomm/12.pdf

27. Goals of congestion control
    Objective: avoid congestion collapse; reasonably high link utilization; fairness
    Under: dynamic N; wide range of C; delayed feedback
    Problem formulations are too vague!

28. Rational choice of scheme is challenging
    Different goals?
    Different assumptions about network?
    One scheme just plain better?

29. Networks constrained by a fuzzy idea of TCP's assumptions
    Mask stochastic loss
    Bufferbloat
    Mask out-of-order delivery
    No parallel/multipath routing
    Advice for Internet Subnetwork Designers (RFC 3819) is 21,000 words!
    Based on slides from Keith Winstein: http://conferences.sigcomm.org/sigcomm/2013/slides/sigcomm/12.pdf

30. Design rationale? (from XCP paper)
    "Congestion is not a binary variable, so congestion signaling should reflect the degree of congestion. … This [XCP] allows the senders to decrease their sending windows quickly when the bottleneck is highly congested, … . The resulting protocol is both more responsive and less oscillatory."
    "A fundamental characteristic of such a system is that it becomes unstable for some large feedback delay. … In the context of congestion control, this means that as delay increases, the sources should change their sending rates more slowly."
    Fuzzy and qualitative (handwavy) design constraints

31. TCP ex Machina: Computer-Generated Congestion Control (SIGCOMM'13)
    First paper to demonstrate that machine-generated logic can defeat handcrafted congestion control (in some cases).
    Based on slides from Keith Winstein: http://conferences.sigcomm.org/sigcomm/2013/slides/sigcomm/12.pdf

32. If congestion control is the answer, what's the question?

33. If congestion control is the answer, what's the question?
    Are there better answers?

34. Free the network to evolve
    Transport layer should adapt to whatever the network does and whatever the application wants.

35. A more precise formulation of the congestion control problem
    Control knobs
    Objectives
    Environment (network model & traffic model)

36. The knobs of congestion control
    * Superrational congestion control: assuming every node is running the same algorithm

37. Objectives of congestion control

38. Assumptions about the environment

39. Remy: Computer-generated congestion control
    Inputs to Remy: control knobs; performance objectives; network model; traffic model
    Output: RemyCC (Remy-generated congestion control)

40. RemyCC workflow
    Feedback (acks, timeout, …) → congestion control logic → when to send the next packet

41. RemyCC workflow
    Feedback (acks, timeout, …) → congestion signals → MIMD parameters → when to send the next packet

42. A RemyCC tracks three congestion signals (State)

43. A RemyCC maps each state to an action
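The state and action contents of these two slides were images and are not preserved here. The sketch below follows the description in the Remy paper as best understood, so treat the details as assumptions: the state is taken to be an EWMA of ACK interarrival times, an EWMA of sender-timestamp gaps echoed in ACKs, and the ratio of the latest RTT to the minimum RTT; the action is taken to be a window multiple, a window increment, and a minimum gap between sends. The class names and the two example rules are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Memory:
    ack_ewma: float    # EWMA of time between ACK arrivals (ms)
    send_ewma: float   # EWMA of time between sender timestamps echoed in ACKs (ms)
    rtt_ratio: float   # latest RTT divided by the minimum RTT seen


@dataclass(frozen=True)
class Action:
    window_multiple: float    # m: cwnd is multiplied by this
    window_increment: float   # b: then this is added
    min_send_gap_ms: float    # r: lower bound on time between sends


@dataclass(frozen=True)
class Rule:
    lower: Memory   # inclusive lower corner of the region of state space
    upper: Memory   # exclusive upper corner
    action: Action

    def matches(self, m: Memory) -> bool:
        return (self.lower.ack_ewma <= m.ack_ewma < self.upper.ack_ewma
                and self.lower.send_ewma <= m.send_ewma < self.upper.send_ewma
                and self.lower.rtt_ratio <= m.rtt_ratio < self.upper.rtt_ratio)


def apply_rules(rules: List[Rule], memory: Memory, cwnd: float) -> Tuple[float, float]:
    """Return (new cwnd, minimum send gap) for the first rule covering this state."""
    for rule in rules:
        if rule.matches(memory):
            a = rule.action
            return max(0.0, a.window_multiple * cwnd + a.window_increment), a.min_send_gap_ms
    raise LookupError("no rule covers this state")


# Hypothetical two-rule table, split on rtt_ratio at 1.5.
BIG = 1e9
RULES = [
    Rule(Memory(0, 0, 1.0), Memory(BIG, BIG, 1.5), Action(1.0, 1.0, 0.5)),
    Rule(Memory(0, 0, 1.5), Memory(BIG, BIG, BIG), Action(0.5, 0.0, 2.0)),
]
```

With this table, a state with `rtt_ratio=1.1` grows a window of 10 to 11, while `rtt_ratio=2.0` cuts it to 5 and spaces sends further apart.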

44. RemyCC workflow
    Feedback (acks, timeout, …) → congestion signals → MIMD parameters → when to send the next packet

45. Remy's job
    Find piecewise-continuous Rule() that optimizes expected value of objective function.

46. Example run of Remy

47. 2-D Search Space
    Objective: Maximizing

48. One action for all states. Find the best value.

49. The best (single) action. Now split it on median.

50. Simulate

51. Optimize each of the new actions

52. Now split the most-used rule

53. Simulate

54. Optimize

55. Split

56. Simulate

57. Optimize

58. Split and so forth…
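The optimize, simulate, split loop walked through in these slides can be sketched on a toy problem. This is a heavily simplified illustration, not Remy itself: the state is one number in [0, 1) instead of several congestion signals, each action is a single number, and the hypothetical `ideal_action` with a squared-error score stands in for a network simulator scoring an objective such as log(throughput) − δ·log(delay).

```python
import random
import statistics


def ideal_action(state):
    """Stand-in for the simulator's hidden optimum (an assumption for this toy)."""
    return 2.0 * state


def simulate(rules, states):
    """Score a rule table; also record which states each rule handled."""
    hits = [[] for _ in rules]
    score = 0.0
    for s in states:
        for i, (lo, hi, action) in enumerate(rules):
            if lo <= s < hi:
                hits[i].append(s)
                score -= (action - ideal_action(s)) ** 2
                break
    return score / len(states), hits


def optimize(rules, states, candidates):
    """Greedily pick, rule by rule, the candidate action with the best score."""
    for i, (lo, hi, _) in enumerate(rules):
        def score_with(a, i=i, lo=lo, hi=hi):
            return simulate(rules[:i] + [(lo, hi, a)] + rules[i + 1:], states)[0]
        rules[i] = (lo, hi, max(candidates, key=score_with))
    return rules


def split_most_used(rules, hits):
    """Split the rule that handled the most states at the median of those states."""
    i = max(range(len(rules)), key=lambda k: len(hits[k]))
    lo, hi, action = rules[i]
    mid = statistics.median(hits[i])
    return rules[:i] + [(lo, mid, action), (mid, hi, action)] + rules[i + 1:]


def remy_like_search(rounds=4, seed=0):
    rng = random.Random(seed)
    states = [rng.random() for _ in range(200)]
    candidates = [k / 10 for k in range(21)]
    rules = [(0.0, 1.0, 1.0)]          # one action for all states
    history = []
    for _ in range(rounds):
        rules = optimize(rules, states, candidates)
        score, hits = simulate(rules, states)
        history.append(score)
        rules = split_most_used(rules, hits)
    return rules, history


rules, history = remy_like_search()
```

Each split leaves the score unchanged because both halves inherit the parent's action, and each optimize pass can only keep or improve it, so `history` never decreases. The table gets finer exactly where the traffic spends its time, which is the point of splitting the most-used rule.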

59. Eventually we get this…

60. Scenario 1: details

61-72. Scenario 1: throughput-delay plot (successive builds of one figure)
    [Figure: throughput (Mbps) vs. queueing delay (ms), with "Better" marking the preferred direction. Schemes are added build by build: NewReno; then Vegas, Cubic, and Compound; then XCP and Cubic/sfqCoDel; then RemyCC with δ = 0.1, δ = 1, and δ = 10. One build annotates the median outcome, throughput unfairness, and delay unfairness.]

73. Scenario 2: Verizon LTE, n = 8
    [Figure: throughput (Mbps) vs. queueing delay (ms) for Vegas, Remy δ = 0.1, Remy δ = 1, Remy δ = 10, Cubic, Compound, NewReno, XCP, and Cubic/sfqCoDel]

74-76. The effect of prior knowledge (successive builds of one figure)
    [Figure: log(normalized throughput) − log(delay) vs. link speed (megabits/sec), with ticks at 4.74, 15, and 47.4. Curves are added build by build: RemyCC 10x; then RemyCC exact; then Cubic-over-sfqCoDel.]

77. Critiques?

78. Takeaways (not just transport protocols)
    An important problem
    Pervasively used => hard to define goals precisely
    Traditionally relying on expert-crafted heuristics/assumptions/philosophies
    Sounds familiar? Cache policy, routing, database mngt, deep learning arch search, catching bugs, …

79. Another subtle takeaway
    This is not exactly ML: you can't find the word ML/machine-learning/training/etc. in the paper!
    Focus on what's missing in the literature instead.
    But this did pave the road for more blackbox/ML solutions, e.g., "An Experimental Study of the Learnability of Congestion Control" (SIGCOMM 2014).