Chapter 3: Transport Layer - PowerPoint Presentation

Uploaded by lydia (@lydia) on 2023-11-08

Transport Layer. Computer Networking: A Top Down Approach, 4th edition. Jim Kurose, Keith Ross. Addison-Wesley, July 2007. Computer Networking: A Top Down Approach, 5th edition. Jim Kurose, Keith Ross. ID: 1030507

Presentation Transcript

1. Chapter 3: Transport Layer
Computer Networking: A Top Down Approach, 4th edition. Jim Kurose, Keith Ross. Addison-Wesley, July 2007.
Computer Networking: A Top Down Approach, 5th edition. Jim Kurose, Keith Ross. Addison-Wesley, April 2009.

2. Chapter 3: Transport Layer
Our goals:
Understand principles behind transport-layer services: multiplexing and demultiplexing; reliable data transfer; flow control; congestion control.
Learn about transport-layer protocols in the Internet: UDP (connectionless transport); TCP (connection-oriented transport); TCP congestion control.

3. Chapter 3 outline
3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP (segment structure, reliable data transfer, flow control, connection management)
3.6 Principles of congestion control
3.7 TCP congestion control

4. Transport services and protocols
Provide logical communication between app processes running on different hosts.
Transport protocols run in end systems. Send side: breaks app messages into segments, passes them to network layer. Rcv side: reassembles segments into messages, passes them to app layer.
More than one transport protocol is available to apps. Internet: TCP and UDP.
[Figure: two end hosts, each with application, transport, network, data link, and physical layers, joined by logical end-end transport.]

5. Internet transport-layer protocols
Reliable, in-order delivery to app (TCP): congestion control; flow control; connection setup.
Unreliable, unordered delivery to app (UDP): no-frills extension of “best-effort” IP.
Services not available: delay guarantees; bandwidth guarantees.
[Figure: logical end-end transport between two end hosts; the nodes in between show only the network, data link, and physical layers.]

7. Multiplexing/demultiplexing
Demultiplexing at rcv host: delivering received segments to correct socket.
Multiplexing at send host: gathering data from multiple sockets, enveloping data with header (later used for demultiplexing).
[Figure: hosts 1, 2, and 3 with processes P1 to P4 attached to sockets above the transport layer.]

8. How demultiplexing works: general for TCP and UDP
Host receives IP datagrams: each datagram has source and destination IP addresses; each datagram carries 1 transport-layer segment; each segment has source and destination port numbers.
Host uses IP addresses & port numbers to direct segment to appropriate socket, process, application.
[Figure: TCP/UDP segment format, 32 bits wide: source port #, dest port #, other header fields, application data (message).]

9. Connectionless demultiplexing
Create sockets with port numbers:
  DatagramSocket mySocket1 = new DatagramSocket(12534);
  DatagramSocket mySocket2 = new DatagramSocket(12535);
UDP socket identified by two-tuple: (dest IP address, dest port number).
When host receives UDP segment: checks destination port number in segment; directs UDP segment to socket with that port number.
IP datagrams with different source IP addresses and/or source port numbers are directed to the same socket.

10. Connectionless demux (cont)
  DatagramSocket serverSocket = new DatagramSocket(6428);
[Figure: clients at IP A and IP B exchange UDP segments with the server at IP C. One exchange uses SP 9157, DP 6428 with reply SP 6428, DP 9157; the other uses SP 5775, DP 6428 with reply SP 6428, DP 5775.]
SP provides “return address”.

11. Connection-oriented demux
TCP socket identified by 4-tuple: source IP address; source port number; dest IP address; dest port number.
Recv host uses all four values to direct segment to appropriate socket.
Server host may support many simultaneous TCP sockets: each socket identified by its own 4-tuple.
Web servers have different sockets for each connecting client: non-persistent HTTP will have a different socket for each request.

12. Connection-oriented demux (cont)
[Figure: clients at IP A and IP B connect to the server at IP C on DP 80. Segments shown: S-IP A, D-IP C, SP 9157, DP 80; S-IP B, D-IP C, SP 9157, DP 80; S-IP B, D-IP C, SP 5775, DP 80.]
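The difference between the UDP two-tuple and the TCP 4-tuple described above can be illustrated with a small lookup-key sketch. The class `DemuxSketch` and its methods are hypothetical names introduced here for illustration; real operating systems use their own socket tables, and the IP addresses are abbreviated to the letters used on the slides.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: socket lookup keys for UDP vs. TCP demultiplexing.
class DemuxSketch {
    // UDP: the socket is identified by (dest IP, dest port) only;
    // the source fields are ignored for the lookup.
    static String udpKey(String srcIp, int srcPort, String dstIp, int dstPort) {
        return dstIp + ":" + dstPort;
    }

    // TCP: the socket is identified by the full 4-tuple.
    static String tcpKey(String srcIp, int srcPort, String dstIp, int dstPort) {
        return srcIp + ":" + srcPort + "->" + dstIp + ":" + dstPort;
    }

    public static void main(String[] args) {
        Map<String, String> udpSockets = new HashMap<>();
        udpSockets.put(udpKey("A", 9157, "C", 6428), "serverSocket");
        // A segment from a different client maps to the same UDP socket.
        System.out.println(udpSockets.get(udpKey("B", 5775, "C", 6428)));

        Map<String, String> tcpSockets = new HashMap<>();
        tcpSockets.put(tcpKey("A", 9157, "C", 80), "connection-1");
        tcpSockets.put(tcpKey("B", 9157, "C", 80), "connection-2");
        // Same ports but a different source IP: a different TCP socket.
        System.out.println(tcpSockets.get(tcpKey("B", 9157, "C", 80)));
    }
}
```

String keys keep the sketch short; a real implementation would more likely use a dedicated key type.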

14. UDP: User Datagram Protocol [RFC 768]
“No frills,” “bare bones” transport protocol.
“Best effort” service; UDP segments may be: lost; delivered out of order to app.
Connectionless: no handshaking between UDP sender, receiver; each UDP segment handled independently.
Why is there a UDP? No connection establishment (which can add delay); simple: no connection state at sender, receiver; small segment header; no congestion control: UDP can blast away as fast as desired (more later on interaction with TCP!).

15. UDP: more
Often used for streaming multimedia apps: loss tolerant; rate sensitive.
Other UDP uses: DNS; SNMP (net mgmt).
Reliable transfer over UDP: add reliability at app layer; application-specific error recovery!
Used for multicast, broadcast in addition to unicast (point-point).
[Figure: UDP segment format, 32 bits wide: source port #, dest port #, length, checksum, application data (message). Length is in bytes of the UDP segment, including header.]

17. Principles of reliable data transfer
Important in app., transport, link layers.
Top-10 list of important networking topics!
Characteristics of the unreliable channel will determine the complexity of the reliable data transfer protocol (rdt).

20. Reliable data transfer: getting started
rdt_send(): called from above (e.g., by app.); passed data to deliver to receiver upper layer.
udt_send(): called by rdt to transfer packet over unreliable channel to receiver.
rdt_rcv(): called when packet arrives on rcv-side of channel.
deliver_data(): called by rdt to deliver data to upper layer.
[Figure: send side and receive side.]

21. Flow control
The study of end-to-end flow and congestion control is complicated by: heterogeneous resources (links, switches, applications); different delays due to network dynamics; effects of background traffic.
We start with a simple case: hop-by-hop flow control.

22. Hop-by-hop flow control
Approaches/techniques for hop-by-hop flow control: stop-and-wait; sliding window (Go back N, selective reject).

23. Stop-and-wait: reliable transfer over a reliable channel
Underlying channel perfectly reliable: no bit errors; no loss of packets.
Stop and wait: sender sends one packet, then waits for receiver response.

24. Channel with bit errors
Underlying channel may flip bits in packet: checksum to detect bit errors.
The question: how to recover from errors?
Acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK.
Negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors.
Sender retransmits pkt on receipt of NAK.
New mechanisms for: error detection; receiver feedback: control msgs (ACK, NAK) rcvr->sender.

25. Stop-and-wait: corrupt ACK/NAK
What happens if ACK/NAK corrupted? Sender doesn’t know what happened at receiver! Can’t just retransmit: possible duplicate.
Handling duplicates: sender retransmits current pkt if ACK/NAK garbled; sender adds sequence number to each pkt; receiver discards (doesn’t deliver up) duplicate pkt.

26. Discussion
Sender: seq # added to pkt; two seq. #’s (0,1) will suffice. Why?; must check if received ACK/NAK corrupted.
Receiver: must check if received packet is duplicate; state indicates whether 0 or 1 is expected pkt seq #.
Note: receiver cannot know if its last ACK/NAK was received OK at sender.
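The receiver behavior described above (one bit of state, duplicates discarded but still acknowledged) might look roughly like the following sketch. `AltBitReceiver` is a hypothetical name; the sketch assumes packets have already passed the checksum test and that each ACK carries the sequence number of the packet it acknowledges.

```java
// Hypothetical stop-and-wait receiver with 1-bit sequence numbers.
// Assumption: corrupted packets were already filtered out by a checksum.
class AltBitReceiver {
    private int expectedSeq = 0;                         // 0 or 1
    private final StringBuilder delivered = new StringBuilder();

    // Handles one arriving packet and returns the seq # carried in the ACK.
    int receive(int seq, String data) {
        if (seq == expectedSeq) {
            delivered.append(data);                      // new packet: deliver up
            expectedSeq = 1 - expectedSeq;               // flip the expected bit
        }
        // A duplicate is not delivered again, but it is still ACKed
        // so that the sender can stop retransmitting it.
        return seq;
    }

    String delivered() {
        return delivered.toString();
    }

    public static void main(String[] args) {
        AltBitReceiver r = new AltBitReceiver();
        r.receive(0, "a");
        r.receive(0, "a");   // retransmission after a garbled ACK
        r.receive(1, "b");
        System.out.println(r.delivered());
    }
}
```

Two sequence numbers suffice here because the sender never has more than one unacknowledged packet outstanding.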

27. Channels with errors and loss
New assumption: underlying channel can also lose packets (data or ACKs). Checksum, seq. #, ACKs, retransmissions will be of help, but not enough.
Approach: sender waits “reasonable” amount of time for ACK; retransmits if no ACK received in this time.
If pkt (or ACK) just delayed (not lost): retransmission will be duplicate, but use of seq. #’s already handles this; receiver must specify seq # of pkt being ACKed.
Requires countdown timer.

28. Stop-and-wait operation: summary
Stop and wait: sender waits for ACK before sending another frame; sender uses a timer to re-transmit if no ACK arrives.
If ACK is lost: A sends frame, B’s ACK gets lost; A times out & re-transmits the frame, B receives a duplicate.
Sequence numbers are added (frame 0,1; ACK 0,1).
Timeout should be related to round trip time estimates: if too small -> unnecessary re-transmissions; if too large -> long delays.

29. Stop-and-wait with lost packet/frame

32. Stop and wait performance
Utilization: fraction of time sender busy sending.
Ideal case (error free): $U = T_{frame}/(T_{frame} + 2T_{prop}) = 1/(1+2a)$, where $a = T_{prop}/T_{frame}$.

33. Performance of stop-and-wait
Example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet.
$T_{transmit} = L/R = (8\ \text{kb/pkt})/(10^9\ \text{b/sec}) = 8$ microsec, where L is the packet length in bits and R is the transmission rate in bps.
$U_{sender}$: utilization, the fraction of time sender busy sending. With RTT = 30 msec, $U_{sender} = (L/R)/(RTT + L/R) = 0.008/30.008 \approx 0.00027$.
1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link.
Network protocol limits use of physical resources!

34. rdt3.0: stop-and-wait operation
[Figure: sender/receiver timeline. First packet bit transmitted, t = 0; last packet bit transmitted, t = L / R; first packet bit arrives; last packet bit arrives, send ACK; ACK arrives, send next packet, t = RTT + L / R.]

35. Consider losses
Assume Timeout ~ $2T_{prop}$.
On average, $N_x$ attempts are needed to get the frame through; p is the probability of a frame being in error.
Pr[k attempts are made before the frame is transmitted correctly] $= p^{k-1}(1-p)$.
$N_x = \sum_k k \Pr[k] = 1/(1-p)$.
For stop-and-wait: $U = T_{frame}/[N_x (T_{frame} + 2T_{prop})] = 1/[N_x(1+2a)] = (1-p)/(1+2a)$.
Stop and wait is a conservative approach to flow control, but it is wasteful.
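The step from the attempt probabilities to the average of 1/(1-p) attempts is the mean of a geometric distribution. A short derivation, assuming independent frame errors with probability p on every attempt (the numeric value p = 0.2 is only an illustration, not from the slides):

```latex
\begin{align*}
N_x &= \sum_{k=1}^{\infty} k\, p^{k-1} (1-p)
     = (1-p)\, \frac{d}{dp} \sum_{k=0}^{\infty} p^{k}
     = (1-p)\, \frac{d}{dp}\, \frac{1}{1-p} \\
    &= \frac{1-p}{(1-p)^{2}} = \frac{1}{1-p}, \\[4pt]
U   &= \frac{1}{N_x (1+2a)} = \frac{1-p}{1+2a}.
\end{align*}
% Illustration: p = 0.2 gives N_x = 1.25 attempts and U = 0.8 / (1 + 2a).
```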

36. Sliding window techniques
TCP is a variant of sliding window.
Includes Go back N (GBN) and selective repeat/reject.
Allows for outstanding packets without Ack.
More complex than stop and wait: need to buffer un-Ack’ed packets & more book-keeping.

37. Pipelined (sliding window) protocols
Pipelining: sender allows multiple, “in-flight”, yet-to-be-acknowledged pkts: range of sequence numbers must be increased; buffering at sender and/or receiver.
Two generic forms of pipelined protocols: go-Back-N, selective repeat.

38. Pipelining: increased utilization
[Figure: sender/receiver timeline with three packets in flight. First packet bit transmitted, t = 0; last bit transmitted, t = L / R; first packet bit arrives; last packet bit arrives, send ACK; last bit of 2nd packet arrives, send ACK; last bit of 3rd packet arrives, send ACK; ACK arrives, send next packet, t = RTT + L / R.]
Increase utilization by a factor of 3!

39. Go-Back-N
Sender: k-bit seq # in pkt header; “window” of up to N consecutive unack’ed pkts allowed.
ACK(n): ACKs all pkts up to, including seq # n (“cumulative ACK”); may receive duplicate ACKs (more later…).
Timer for each in-flight pkt.
timeout(n): retransmit pkt n and all higher seq # pkts in window.

40. GBN: receiver side
ACK-only: always send ACK for correctly-received pkt with highest in-order seq #: may generate duplicate ACKs; need only remember expected seq num.
Out-of-order pkt: discard (don’t buffer) -> no receiver buffering! Re-ACK pkt with highest in-order seq #.
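The receiver rules above fit in a few lines. This is a minimal sketch with a hypothetical class name, `GbnReceiver`; it assumes uncorrupted packets and, for brevity, ignores wraparound of the k-bit sequence number.

```java
// Hypothetical Go-Back-N receiver: no buffering, cumulative ACKs.
// Assumptions: packets are uncorrupted; seq # wraparound is ignored.
class GbnReceiver {
    private int expectedSeq = 0;   // the only state the receiver keeps

    // Handles one arriving packet and returns the ACK number sent back:
    // the highest in-order seq # received so far (-1 if none yet).
    int receive(int seq) {
        if (seq == expectedSeq) {
            expectedSeq++;         // in order: deliver up and advance
        }
        // Out-of-order packets are discarded; in every case the receiver
        // (re-)ACKs the highest in-order packet.
        return expectedSeq - 1;
    }

    public static void main(String[] args) {
        GbnReceiver r = new GbnReceiver();
        int[] arrivals = {0, 2, 3, 1, 2};   // pkt 1 is delayed behind 2 and 3
        for (int seq : arrivals) {
            System.out.println("pkt " + seq + " -> ACK " + r.receive(seq));
        }
    }
}
```

The repeated ACK values produced for packets 2 and 3 in the example are the duplicate ACKs mentioned above.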

41. GBN in action

42. Selective Repeat
Receiver individually acknowledges all correctly received pkts: buffers pkts, as needed, for eventual in-order delivery to upper layer.
Sender only resends pkts for which ACK not received: sender timer for each unACKed pkt.
Sender window: N consecutive seq #’s; limits seq #s of sent, unACKed pkts.

43. Selective repeat: sender, receiver windows

44. Selective repeat
Sender:
Data from above: if next available seq # in window, send pkt.
timeout(n): resend pkt n, restart timer.
ACK(n) in [sendbase, sendbase+N-1]: mark pkt n as received; if n is the smallest unACKed pkt, advance window base to next unACKed seq #.
Receiver:
pkt n in [rcvbase, rcvbase+N-1]: send ACK(n); out-of-order: buffer; in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt.
pkt n in [rcvbase-N, rcvbase-1]: ACK(n).
Otherwise: ignore.

45. Selective repeat in action

46. Selective repeat: dilemma
Example: seq #’s: 0, 1, 2, 3; window size = 3.
Receiver sees no difference in two scenarios! Incorrectly passes duplicate data as new in (a).
Q: what relationship between seq # size and window size? (check hwk), (try applet)

47. Performance: selective repeat
Error-free case: if the window w is such that the pipe is full, U = 100%; otherwise $U = w \cdot U_{stop-and-wait} = w/(1+2a)$.
In case of error: if w fills the pipe, $U = 1-p$; otherwise $U = w \cdot U_{stop-and-wait} = w(1-p)/(1+2a)$.
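The four cases above collapse into one expression, which makes it easy to compare stop-and-wait (w = 1) with larger windows. The sketch below uses a hypothetical class `WindowUtilization`; the parameter a = 1875 in the example is derived from the earlier 1 Gbps example (15 ms propagation delay divided by 8 microsec transmission time).

```java
// Hypothetical helper for the utilization formulas on this slide.
class WindowUtilization {
    // w: window size in frames, a: Tprop / Tframe, p: frame error probability.
    // The pipe is full when w >= 1 + 2a, in which case U = 1 - p;
    // otherwise U = w (1 - p) / (1 + 2a).
    static double utilization(int w, double a, double p) {
        double errorFree = Math.min(1.0, w / (1.0 + 2.0 * a));
        return errorFree * (1.0 - p);
    }

    public static void main(String[] args) {
        double a = 0.015 / 0.000008;   // 15 ms prop. delay / 8 microsec frame
        System.out.println("stop-and-wait: " + utilization(1, a, 0.0));
        System.out.println("window of 3:   " + utilization(3, a, 0.0));
        System.out.println("full pipe:     " + utilization(4000, a, 0.0));
    }
}
```

With w = 1 the method reduces to the stop-and-wait result (1-p)/(1+2a).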

49. TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)
Full duplex data: bi-directional data flow in same connection; MSS: maximum segment size.
Connection-oriented: handshaking (exchange of control msgs) init’s sender, receiver state before data exchange.
Flow controlled: sender will not overwhelm receiver.
Point-to-point: one sender, one receiver.
Reliable, in-order byte stream: no “message boundaries”.
Pipelined: TCP congestion and flow control set window size.
Send & receive buffers.

50. TCP segment structure
[Figure: TCP segment format, 32 bits wide: source port #, dest port #; sequence number; acknowledgement number; head len, not used, flag bits U A P R S F, Receive window; checksum, Urg data pnter; Options (variable length); application data (variable length).]
Sequence and acknowledgement numbers: counting by bytes of data (not segments!).
Receive window: # bytes rcvr willing to accept.

51. TCP segment structure
[Figure: TCP segment format, 32 bits wide: source port #, dest port #; sequence number; acknowledgement number; head len, not used, flag bits U A P R S F, Receive window; checksum, Urg data pnter; Options (variable length); application data (variable length).]
URG: urgent data (generally not used).
ACK: ACK # valid.
PSH: push data now (generally not used).
RST, SYN, FIN: connection estab (setup, teardown commands).
Checksum: Internet checksum (as in UDP).
Sequence and acknowledgement numbers: counting by bytes of data (not segments!).
Receive window: # bytes rcvr willing to accept.

52. TCP seq. #’s and ACKs
Seq. #’s: byte stream “number” of first byte in segment’s data.
ACKs: seq # of next byte expected from other side; cumulative ACK.
Q: how does the receiver handle out-of-order segments? A: TCP spec doesn’t say; up to implementor.
Simple telnet scenario (Host A and Host B, in time order):
User types ‘C’: A sends Seq=42, ACK=79, data = ‘C’.
Host B ACKs receipt of ‘C’, echoes back ‘C’: B sends Seq=79, ACK=43, data = ‘C’.
Host A ACKs receipt of echoed ‘C’: A sends Seq=43, ACK=80.

53. Reliability in TCP
Components of reliability:
1. Sequence numbers
2. Retransmissions
3. Timeout mechanism(s): function of the round trip time (RTT) between the two hosts (is it static?)

54. TCP Round Trip Time and Timeout
Q: how to set TCP timeout value? Longer than RTT, but RTT varies. Too short: premature timeout, unnecessary retransmissions. Too long: slow reaction to segment loss.
Q: how to estimate RTT? SampleRTT: measured time from segment transmission until ACK receipt; ignore retransmissions. SampleRTT will vary; we want estimated RTT “smoother”: average several recent measurements, not just current SampleRTT.

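55. TCP Round Trip Time and Timeout
$\text{EstimatedRTT}(k) = (1-\alpha) \cdot \text{EstimatedRTT}(k-1) + \alpha \cdot \text{SampleRTT}(k)$
$= (1-\alpha) \cdot ((1-\alpha) \cdot \text{EstimatedRTT}(k-2) + \alpha \cdot \text{SampleRTT}(k-1)) + \alpha \cdot \text{SampleRTT}(k)$
$= (1-\alpha)^k \cdot \text{SampleRTT}(0) + \alpha (1-\alpha)^{k-1} \cdot \text{SampleRTT}(1) + \dots + \alpha \cdot \text{SampleRTT}(k)$, taking $\text{EstimatedRTT}(0) = \text{SampleRTT}(0)$.
Exponential weighted moving average: influence of past sample decreases exponentially fast.
Typical value: $\alpha = 0.125$.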

56. Example RTT estimation

57. TCP Round Trip Time and Timeout
Setting the timeout: EstimatedRTT plus “safety margin”; large variation in EstimatedRTT -> larger safety margin.
1. Estimate how much SampleRTT deviates from EstimatedRTT: $\text{DevRTT} = (1-\beta) \cdot \text{DevRTT} + \beta \cdot |\text{SampleRTT} - \text{EstimatedRTT}|$ (typically, $\beta = 0.25$).
2. Set timeout interval: $\text{TimeoutInterval} = \text{EstimatedRTT} + 4 \cdot \text{DevRTT}$.
3. For further re-transmissions (if the 1st re-tx was not Ack’ed): $RTO = q \cdot RTO$, with q = 2 for exponential backoff, similar to Ethernet CSMA/CD backoff.
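The three steps above can be combined into a small estimator. The following is a sketch, not TCP's actual implementation: the class name `RttEstimator`, the choice to update the deviation against the previous estimate (the order RFC 6298 uses), and the reset of the backoff factor on a fresh sample are all assumptions made for this illustration.

```java
// Hypothetical RTT estimator following the formulas on this slide.
class RttEstimator {
    static final double ALPHA = 0.125;   // EWMA weight for EstimatedRTT
    static final double BETA = 0.25;     // EWMA weight for DevRTT

    private double estimatedRtt;
    private double devRtt;
    private int backoff = 1;             // multiplier q^n applied after timeouts

    RttEstimator(double initialRtt, double initialDev) {
        this.estimatedRtt = initialRtt;
        this.devRtt = initialDev;
    }

    // Feed one SampleRTT (measured on a segment that was not retransmitted).
    void onSample(double sampleRtt) {
        // Assumption: deviation is measured against the previous estimate.
        devRtt = (1 - BETA) * devRtt + BETA * Math.abs(sampleRtt - estimatedRtt);
        estimatedRtt = (1 - ALPHA) * estimatedRtt + ALPHA * sampleRtt;
        backoff = 1;                     // assumption: a fresh sample ends the backoff
    }

    // Exponential backoff: RTO = q * RTO with q = 2.
    void onTimeout() {
        backoff *= 2;
    }

    double timeoutInterval() {
        return backoff * (estimatedRtt + 4 * devRtt);
    }

    public static void main(String[] args) {
        RttEstimator e = new RttEstimator(100.0, 10.0);   // msec, illustrative
        e.onSample(120.0);
        System.out.println("TimeoutInterval = " + e.timeoutInterval());
        e.onTimeout();
        System.out.println("after backoff   = " + e.timeoutInterval());
    }
}
```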

59. TCP reliable data transfer
TCP creates reliable service on top of IP’s unreliable service.
Pipelined segments. Cumulative acks. TCP uses single retransmission timer.
Retransmissions are triggered by: timeout events; duplicate acks.
Initially consider simplified TCP sender: ignore duplicate acks; ignore flow control, congestion control.

60. TCP sender events
Data rcvd from app: create segment with seq # (seq # is byte-stream number of first data byte in segment); start timer if not already running (think of timer as for oldest unacked segment); expiration interval: TimeOutInterval.
Timeout: retransmit segment that caused timeout; restart timer.
Ack rcvd: if it acknowledges previously unacked segments, update what is known to be acked and start timer if there are outstanding segments.

61. TCP: retransmission scenarios
Lost ACK scenario: Host A sends Seq=92, 8 bytes data; Host B’s ACK=100 is lost (X); after the Seq=92 timeout, A resends Seq=92, 8 bytes data; B replies ACK=100; SendBase = 100.
Premature timeout: Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; Host B replies ACK=100 and ACK=120; the Seq=92 timeout fires before the ACKs arrive, so A resends Seq=92, 8 bytes data; SendBase moves to 100 and then to 120; B replies ACK=120.

62. TCP retransmission scenarios (more)
Cumulative ACK scenario: Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; Host B’s ACK=100 is lost (X), but ACK=120 arrives before the timeout; SendBase = 120.

63. Fast Retransmit
Time-out period often relatively long: long delay before resending lost packet.
Detect lost segments via duplicate ACKs: sender often sends many segments back-to-back; if a segment is lost, there will likely be many duplicate ACKs.
If sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost. Fast retransmit: resend segment before timer expires.

65. (Self-clocking)

66. TCP Flow Control
Receive side of TCP connection has a receive buffer; app process may be slow at reading from buffer.
Flow control: sender won’t overflow receiver’s buffer by transmitting too much, too fast.
Speed-matching service: matching the send rate to the receiving app’s drain rate.

67. TCP Flow control: how it works
(Suppose TCP receiver discards out-of-order segments.)
Spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead].
Rcvr advertises spare room by including value of RcvWindow in segments.
Sender limits unACKed data to RcvWindow: guarantees receive buffer doesn’t overflow.
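The window arithmetic above is simple enough to write out directly. In this sketch the class `FlowControl` and the method `maySend` are hypothetical names; the byte counts in the example are made up for illustration.

```java
// Hypothetical helpers for the flow-control bookkeeping on this slide.
class FlowControl {
    // Receiver side: spare room advertised to the sender.
    static long rcvWindow(long rcvBuffer, long lastByteRcvd, long lastByteRead) {
        return rcvBuffer - (lastByteRcvd - lastByteRead);
    }

    // Sender side: a new segment may be sent only if the unACKed data,
    // including that segment, stays within the advertised RcvWindow.
    static boolean maySend(long lastByteSent, long lastByteAcked,
                           long segmentBytes, long rcvWindow) {
        return (lastByteSent - lastByteAcked) + segmentBytes <= rcvWindow;
    }

    public static void main(String[] args) {
        long window = rcvWindow(4096, 3000, 1000);   // 2000 bytes still unread
        System.out.println("RcvWindow = " + window);
        System.out.println("send 1000 more? " + maySend(5000, 4000, 1000, window));
        System.out.println("send 1100 more? " + maySend(5000, 4000, 1100, window));
    }
}
```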

70. TCP Connection Management
Recall: TCP sender, receiver establish “connection” before exchanging data.
Initialize TCP variables: seq. #s; buffers, flow control info (e.g. RcvWindow).
Client: connection initiator
  Socket clientSocket = new Socket("hostname", portNumber);  // portNumber is an int
Server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();
Three way handshake:
Step 1: client host sends TCP SYN segment to server; specifies initial seq #; no data.
Step 2: server host receives SYN, replies with SYNACK segment; server allocates buffers; specifies server initial seq. #.
Step 3: client receives SYNACK, replies with ACK segment, which may contain data.

71. TCP Connection Management (cont.)
Closing a connection: client closes socket: clientSocket.close();
Step 1: client end system sends TCP FIN control segment to server.
Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.
[Figure: client/server timeline: client sends FIN (close); server replies ACK, then sends FIN (close); client replies ACK, enters timed wait, then closed.]

72. TCP Connection Management (cont.)
Step 3: client receives FIN, replies with ACK. Enters “timed wait”: will respond with ACK to received FINs.
Step 4: server receives ACK. Connection closed.
Note: with small modification, can handle simultaneous FINs.
[Figure: client/server timeline: FIN and ACK in each direction; both sides closing; client in timed wait; both sides closed.]

73. TCP Connection Management (cont)
[Figure: TCP client lifecycle and TCP server lifecycle.]

75. Principles of Congestion Control
Congestion: informally: “too many sources sending too much data too fast for network to handle”; different from flow control!
Manifestations: lost packets (buffer overflow at routers); long delays (queueing in router buffers).
A top-10 problem!

76. Congestion Control & Traffic Management
Does adding bandwidth to the network or increasing the buffer sizes solve the problem of congestion? No. We cannot over-engineer the whole network due to: increased traffic from applications (multimedia, etc.); legacy systems (expensive to update); unpredictable traffic mix inside the network: where is the bottleneck?
Congestion control & traffic management is needed: to provide fairness; to provide QoS and priorities.

77. Network Congestion
Modeling the network as a network of queues (in switches and routers): store and forward; statistical multiplexing.

78. Propagation of congestion
If flow control is used hop-by-hop, then congestion may propagate throughout the network.

79. Congestion phases and effects
Ideal case: infinite buffers; Tput increases with demand & saturates at network capacity.
Representative of Tput-delay design trade-off.
Network Power = Tput/delay.
[Figure: Tput/Gput and delay curves.]

80. Practical case: finite buffers, loss
No congestion --> near ideal performance.
Overall moderate congestion: severe congestion in some nodes; dynamics of the network/routing and overhead of protocol adaptation decrease the network Tput.
Severe congestion: loss of packets and increased discards; extended delays leading to timeouts; both factors trigger re-transmissions; leads to a chain reaction bringing the Tput down.

81. [Figure: curve with regions (I) No Congestion, (II) Moderate Congestion, (III) Severe Congestion (Collapse).]
What is the best operational point and how do we get (and stay) there?

82. Congestion Control (CC)
Congestion is a key issue in network design; various techniques for CC:
1. Back pressure: hop-by-hop flow control (X.25, HDLC, Go back N); may propagate congestion in the network.
2. Choke packet: generated by the congested node & sent back to source; example: ICMP source quench; sent due to packet discard or in anticipation of congestion.

83. Congestion Control (CC) (contd.)
3. Implicit congestion signaling: used in TCP; delay increase or packet discard to detect congestion; may erroneously signal congestion (i.e., not always reliable) [e.g., over wireless links]; done end-to-end without network assistance; TCP cuts down its window/rate.

84. Congestion Control (CC) (contd.)
4. Explicit congestion signaling (network-assisted congestion control): gets indication from the network; forward: going to destination; backward: going to source.
3 approaches: Binary: uses 1 bit (DECbit, TCP/IP ECN, ATM). Rate based: specifying bps (ATM). Credit based: indicates how much the source can send (in a window).

87. TCP congestion control: additive increase, multiplicative decrease
Approach: increase transmission rate (window size), probing for usable bandwidth, until loss occurs.
Additive increase: increase rate (or congestion window) CongWin until loss detected.
Multiplicative decrease: cut CongWin in half after loss.
[Figure: congestion window size vs. time. Saw tooth behavior: probing for bandwidth.]

88. TCP Congestion Control: details
Sender limits transmission: $\text{LastByteSent} - \text{LastByteAcked} \le \text{CongWin}$.
Roughly, $\text{rate} = \text{CongWin}/\text{RTT}$ bytes/sec.
CongWin is dynamic, a function of perceived network congestion.
How does sender perceive congestion? Loss event = timeout or duplicate Acks; TCP sender reduces rate (CongWin) after loss event.
Three mechanisms: AIMD; slow start; conservative after timeout events.

89. TCP window management
At any time the allowed window (awnd): awnd = MIN[RcvWin, CongWin], where RcvWin is given by the receiver (i.e., Receive Window) and CongWin is the congestion window.
Slow-start algorithm: start with CongWin = 1, then CongWin = CongWin + 1 with every ‘Ack’. This leads to ‘doubling’ of the CongWin every RTT; i.e., exponential increase.

90. TCP Slow Start
When connection begins, CongWin = 1 MSS (MSS: Maximum Segment Size). Example: MSS = 500 bytes & RTT = 200 msec; initial rate = 20 kbps.
Available bandwidth may be >> MSS/RTT: desirable to quickly ramp up to respectable rate.
When connection begins, increase rate exponentially fast until first loss event.
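The 20 kbps figure follows directly from MSS/RTT, and the same arithmetic shows why exponential growth matters. In the worked example below, the 1 Mbps target is an illustrative number chosen here, not a value from the slides, and the growth model assumes loss-free doubling every RTT.

```latex
\begin{align*}
\text{initial rate} &= \frac{\text{MSS}}{\text{RTT}}
  = \frac{500 \times 8\ \text{bits}}{0.2\ \text{s}}
  = 20{,}000\ \text{bps} = 20\ \text{kbps}, \\[4pt]
\text{rate after } n \text{ RTTs} &\approx 2^{n} \cdot \frac{\text{MSS}}{\text{RTT}}
  = 2^{n} \cdot 20\ \text{kbps}.
\end{align*}
% Illustration: reaching 1 Mbps needs 2^n >= 50, i.e. n = 6 RTTs (about 1.2 s),
% whereas adding one MSS per RTT from the start would need about 49 RTTs.
```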

91. TCP Slow Start (more)
When connection begins, increase rate exponentially until first loss event: double CongWin every RTT; done by incrementing CongWin for every ACK received.
Summary: initial rate is slow but ramps up exponentially fast.
[Figure: Host A/Host B timeline: one segment in the first RTT, then two segments, then four segments.]

92. TCP congestion control
Initially we use slow start: CongWin = CongWin + 1 with every Ack.
When timeout occurs: ssthresh = CongWin/2, CongWin = 1; slow start until ssthresh, then enter congestion avoidance and increase ‘linearly’: CongWin = CongWin + 1 with every RTT, or CongWin = CongWin + 1/CongWin for every Ack.
Additive increase, multiplicative decrease (AIMD).

95. [Figure: CongWin vs. time (RTT): slow start (exponential increase), then congestion avoidance (linear increase).]

96. Refinement: inferring loss (How far should we back off?)
After 3 dup ACKs: CongWin is cut in half; window then grows linearly.
But after timeout event: CongWin instead set to 1 MSS; window then grows exponentially to a threshold, then grows linearly.
Philosophy: 3 dup ACKs indicates network capable of delivering some segments; timeout indicates a “more alarming” congestion scenario.

97. Fast Retransmit & Recovery
Fast retransmit: receiver sends Ack with last in-order segment for every out-of-order segment received; when sender receives 3 duplicate Acks it retransmits the missing/expected segment.
Fast recovery: when 3rd dup Ack arrives: ssthresh = CongWin/2; retransmit segment, set CongWin = ssthresh + 3; for every further duplicate Ack: CongWin = CongWin + 1 (note: beginning of window is ‘frozen’); after sender gets cumulative Ack: CongWin = ssthresh (beginning of window advances to last Ack’ed segment).

99. CongWin: Fast Recovery

100. [Figure: curve with regions (I) No Congestion, (II) Moderate Congestion, (III) Severe Congestion (Collapse).]
Where does TCP operate on this curve?

101. Summary: TCP Congestion Control
When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold.
When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS.
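The four rules in this summary form a small state machine. The sketch below models it once per round trip rather than per ACK; the class name `CongestionWindow`, the per-RTT granularity, the cap at Threshold during slow start, and the floor of 1 MSS on Threshold are simplifying assumptions made for this illustration.

```java
// Hypothetical per-RTT model of the summary rules; CongWin is in units of MSS.
class CongestionWindow {
    private int congWin = 1;
    private int threshold;

    CongestionWindow(int initialThreshold) {
        this.threshold = initialThreshold;
    }

    // One loss-free round trip.
    void onRoundTripWithoutLoss() {
        if (congWin < threshold) {
            // Slow start: double, but do not overshoot Threshold (assumption).
            congWin = Math.min(2 * congWin, threshold);
        } else {
            congWin += 1;            // congestion avoidance: linear growth
        }
    }

    void onTripleDuplicateAck() {
        threshold = Math.max(congWin / 2, 1);
        congWin = threshold;         // continue in congestion avoidance
    }

    void onTimeout() {
        threshold = Math.max(congWin / 2, 1);
        congWin = 1;                 // restart from slow start
    }

    int congWin() { return congWin; }
    int threshold() { return threshold; }

    public static void main(String[] args) {
        CongestionWindow w = new CongestionWindow(8);
        for (int rtt = 0; rtt < 5; rtt++) {
            w.onRoundTripWithoutLoss();
            System.out.println("CongWin = " + w.congWin());
        }
        w.onTripleDuplicateAck();
        System.out.println("after 3 dup ACKs: " + w.congWin());
        w.onTimeout();
        System.out.println("after timeout:    " + w.congWin());
    }
}
```

Integer division is used for CongWin/2, so odd windows round down.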

102. TCP Fairness
Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K.
[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R.]

103. Fairness (more)
Fairness and UDP: multimedia apps often do not use TCP: do not want rate throttled by congestion control. Instead use UDP: pump audio/video at constant rate, tolerate packet loss. Research area: TCP friendly protocols!
Fairness and parallel TCP connections: nothing prevents app from opening parallel connections between 2 hosts. Web browsers do this.
Example: link of rate R supporting 9 connections; new app asks for 1 TCP, gets rate R/10; new app asks for 11 TCPs, gets about R/2 (11R/20)!

104. Congestion Control with Explicit Notification
TCP uses implicit signaling.
ATM (ABR) uses explicit signaling using RM (resource management) cells. ATM: Asynchronous Transfer Mode; ABR: Available Bit Rate.
ABR: congestion notification and congestion avoidance.
Parameters: peak cell rate (PCR); minimum cell rate (MCR); initial cell rate (ICR).

105. Case study: ATM ABR congestion control
ABR: available bit rate: “elastic service”; if sender’s path “underloaded”: sender should use available bandwidth; if sender’s path congested: sender throttled to minimum guaranteed rate.
RM (resource management) cells: sent by sender, interspersed with data cells; bits in RM cell set by switches (“network-assisted”): NI bit: no increase in rate (mild congestion); CI bit: congestion indication; RM cells returned to sender by receiver, with bits intact.

106. ABR uses resource management cell (RM cell) with fields: CI (congestion indication); NI (no increase); ER (explicit rate).
Types of RM cells: Forward RM (FRM); Backward RM (BRM).

107. Congestion notification using RM cells
One RM cell is sent for every Nrm-1 data cells.
If congestion:
The switch may set EFCI (explicit forward congestion indication) in the ATM cell header. Then the destination sets CI=1 in the RM cell going back to the source (ER is modified).
The switch may set CI & NI bits in the RM cell and either send the RM cell on to the destination (FRM) or send a BRM back to the source (with decreased latency).
The switch sets the ER field in the BRM.

109. Congestion Control in ABR
The source reacts to congestion notification by decreasing its rate (rate-based vs. window-based for TCP).
Rate adaptation algorithm:
If CI=0, NI=0: rate increase by factor ‘RIF’ (e.g., 1/16): Rate = Rate + PCR/16.
Else if CI=1: rate decrease by factor ‘RDF’ (e.g., 1/4): Rate = Rate - Rate*1/4.
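The rate adaptation rule above translates almost directly into code. In this sketch the class `AbrSource` is a hypothetical name, and clamping the result between MCR and PCR is an assumption suggested by the parameter names rather than something the slide states; the rates in the example are arbitrary units.

```java
// Hypothetical sketch of the ABR source rate adaptation rule.
class AbrSource {
    static final double RIF = 1.0 / 16;   // rate increase factor (example value)
    static final double RDF = 1.0 / 4;    // rate decrease factor (example value)

    // Returns the new rate after a backward RM cell with the given CI/NI bits.
    // Assumption: the rate is kept within [mcr, pcr].
    static double adapt(double rate, boolean ci, boolean ni, double pcr, double mcr) {
        if (ci) {
            return Math.max(mcr, rate - rate * RDF);   // Rate = Rate - Rate*RDF
        }
        if (!ni) {
            return Math.min(pcr, rate + RIF * pcr);    // Rate = Rate + RIF*PCR
        }
        return rate;                                   // NI set: hold the rate
    }

    public static void main(String[] args) {
        double rate = 100.0;
        rate = adapt(rate, false, false, 160.0, 10.0);
        System.out.println("after CI=0, NI=0: " + rate);
        rate = adapt(rate, true, false, 160.0, 10.0);
        System.out.println("after CI=1:       " + rate);
    }
}
```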

111. Which VC to notify when congestion occurs?
FIFO: if Qlength > 80%, then keep notifying arriving cells until Qlength < lower threshold (this is unfair).
Use several queues: called Fair Queuing.
Use fair allocation = target rate/# of VCs = R/N. If current cell rate (CCR) > fair share, then notify the corresponding VC.

112. What to notify?
CI, NI.
ER (explicit rate) schemes perform the steps: compute the fair share; determine load & congestion; compute the explicit rate & send it back to the source.
Should we put this functionality in the network?

113. Chapter 3: Summary
Principles behind transport layer services: multiplexing, demultiplexing; reliable data transfer; flow control; congestion control.
TCP, ATM ABR.
Next: leaving the network “edge” (application, transport layers), into the network “core”.