EE122 Fall 2011, Scott Shenker, http://inst.eecs.berkeley.edu/~ee122/. Materials with thanks to Jennifer Rexford, Ion Stoica, Vern Paxson, and other colleagues at Princeton and UC Berkeley.
1
Midterm Review
EE122 Fall 2011
Scott Shenker
http://inst.eecs.berkeley.edu/~ee122/
Materials with thanks to Jennifer Rexford, Ion Stoica, Vern Paxson, and other colleagues at Princeton and UC Berkeley
Announcements
Available after class
I hate these review lectures….
2
Agenda
Finish Web caching
Midterm review
3
Finishing Up Web Caching
4
5
Improving HTTP Performance: Caching
Many clients transfer same information
  Generates redundant server and network load
  Clients experience unnecessary latency
[Figure: network diagram with labels Clients, ISP-1, ISP-2, Backbone ISP, Server]
6
Improving HTTP Performance: Caching: How
Response header:
  Expires – how long it’s safe to cache the resource
  No-cache – ignore all caches; always get resource directly from server
If entry has not expired, cache returns it
Otherwise, it issues an if-modified-since
Modifier to GET requests:
  If-modified-since – returns “not modified” if resource not modified since specified time
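The cache behavior described on this slide can be sketched in a few lines. This is a minimal illustration, not real HTTP: the `Origin` and `Cache` classes, the dictionary-shaped responses, and the integer clock passed in as `now` are all assumptions made for the example.

```python
class Origin:
    """Hypothetical origin server holding a single resource."""
    def __init__(self, body, last_modified, max_age):
        self.body = body
        self.last_modified = last_modified
        self.max_age = max_age          # how long a response may be cached
        self.full_responses = 0         # counts responses that carried a body

    def get(self, now, if_modified_since=None):
        # Conditional GET: reply "not modified" if nothing changed since that time.
        if if_modified_since is not None and self.last_modified <= if_modified_since:
            return {"status": 304, "expires": now + self.max_age}
        self.full_responses += 1
        return {"status": 200, "body": self.body, "expires": now + self.max_age}


class Cache:
    """Hypothetical cache sitting between clients and the origin."""
    def __init__(self, origin):
        self.origin = origin
        self.entry = None               # dict with body, fetched_at, expires

    def get(self, now):
        entry = self.entry
        if entry is not None and now < entry["expires"]:
            return entry["body"]        # entry has not expired: serve it
        since = entry["fetched_at"] if entry else None
        resp = self.origin.get(now, if_modified_since=since)
        if resp["status"] == 304:
            entry["expires"] = resp["expires"]   # revalidated: keep cached body
            return entry["body"]
        self.entry = {"body": resp["body"], "fetched_at": now,
                      "expires": resp["expires"]}
        return resp["body"]
```

Usage: a `Cache` wrapped around an `Origin` only triggers a full response when the entry is missing or the resource actually changed; an expired but unchanged entry costs one small "not modified" exchange.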
7
Improving HTTP Performance: Caching: Why
Motive for placing content closer to client:
  User gets better response time
  Content providers get happier users
    Time is money, really!
  Network gets reduced load
How well does caching work?
  Very well, up to a limit
  Large overlap in content
  But many unique requests
    Sound familiar?
8
Improving HTTP Performance: Caching with Reverse Proxies
Cache documents close to server to decrease server load
Typically done by content providers
Only works for static content
[Figure: network diagram with labels Clients, ISP-1, ISP-2, Backbone ISP, Reverse proxies, Server]
9
Improving HTTP Performance: Caching with Forward Proxies
Cache documents close to clients to reduce network traffic and decrease latency
Typically done by ISPs or corporate LANs
[Figure: network diagram with labels Clients, ISP-1, ISP-2, Backbone ISP, Forward proxies, Reverse proxies, Server]
10
Improving HTTP Performance: Caching w/ Content Distribution Networks
Integrate forward and reverse caching functionality
  One overlay network (usually) administered by one entity
  e.g., Akamai
Provide document caching
  Pull: Direct result of clients’ requests
  Push: Expectation of high access rate
Also do some processing
  Handle dynamic web pages
  Transcoding
11
Improving HTTP Performance: Caching with CDNs (cont.)
[Figure: network diagram with labels Clients, ISP-1, ISP-2, Backbone ISP, Forward proxies, CDN, Server]
12
Improving HTTP Performance: CDN Example – Akamai
Akamai creates new domain names for each client content provider.
  e.g., a128.g.akamai.net
The CDN’s DNS servers are authoritative for the new domains
The client content provider modifies its content so that embedded URLs reference the new domains.
  “Akamaize” content
  e.g.: http://www.cnn.com/image-of-the-day.gif becomes http://a128.g.akamai.net/image-of-the-day.gif
Requests now sent to CDN’s infrastructure…
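The URL rewriting in this example amounts to swapping the host part of each embedded URL. A minimal sketch using Python's standard `urllib.parse` follows; the function name `akamaize` is hypothetical, and this only illustrates the rewrite itself, not how Akamai's tooling actually works.

```python
from urllib.parse import urlsplit, urlunsplit

def akamaize(url, cdn_host="a128.g.akamai.net"):
    """Return the URL with its host replaced by the CDN-assigned domain."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(netloc=cdn_host))

rewritten = akamaize("http://www.cnn.com/image-of-the-day.gif")
```

After the rewrite, resolving the new host name goes to the CDN's DNS servers, which is what lets the CDN choose where the request lands.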
13
Hosting: Multiple Sites Per Machine
Multiple Web sites on a single machine
  Hosting company runs the Web server on behalf of multiple sites (e.g., www.foo.com and www.bar.com)
Problem: GET /index.html
  www.foo.com/index.html or www.bar.com/index.html?
Solutions:
  Multiple server processes on the same machine
    Have a separate IP address (or port) for each server
  Include site name in HTTP request
    Single Web server process with a single IP address
    Client includes “Host” header (e.g., Host: www.foo.com)
    Required header with HTTP/1.1
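The "Host" header solution can be sketched as a single handler that picks the site from the request. The `SITES` table, its page contents, and the `handle_request` function are made up for illustration; a real server does far more parsing and validation.

```python
# Hypothetical content for two sites hosted by one server process.
SITES = {
    "www.foo.com": {"/index.html": "foo home page"},
    "www.bar.com": {"/index.html": "bar home page"},
}

def handle_request(raw_request):
    """Return (status, body) for a raw HTTP/1.1 request string."""
    lines = raw_request.split("\r\n")
    method, path, _version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        if not line:                      # blank line ends the headers
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    host = headers.get("host")
    if host is None:
        return 400, "missing Host header"  # the path alone is ambiguous
    site = SITES.get(host)
    if site is None or path not in site:
        return 404, "not found"
    return 200, site[path]
```

The same path, /index.html, yields different pages depending only on the Host header, which is why one process on one IP address is enough.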
14
Hosting: Multiple Machines Per Site
Replicate popular Web site across many machines
  Helps to handle the load
  Places content closer to clients
  Helps when content isn’t cacheable
Problem: Want to direct client to particular replica
  Balance load across server replicas
  Pair clients with nearby servers
15
Multi-Hosting at Single Location
Single IP address, multiple machines
  Run multiple machines behind a single IP address
  Ensure all packets from a single TCP connection go to the same replica
[Figure: load balancer at 64.236.16.20 in front of the replicas]
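One way a load balancer could keep every packet of a TCP connection on the same replica is to hash the connection's addresses and ports. This is only a sketch of that one approach, under the assumption of a fixed replica list; the replica names are hypothetical and real load balancers may instead keep per-connection state.

```python
import hashlib

REPLICAS = ["replica-0", "replica-1", "replica-2"]   # hypothetical names

def pick_replica(src_ip, src_port, dst_ip, dst_port, replicas=REPLICAS):
    """Map a connection's 4-tuple to one replica, deterministically."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(replicas)
    return replicas[index]
```

Because the choice depends only on the 4-tuple, every packet of one connection maps to the same replica, while different connections spread across the list.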
16
Multi-Hosting at Several Locations
Multiple addresses, multiple machines
  Same name but different addresses for all of the replicas
  Configure DNS server to return different addresses
[Figure: replicas at 64.236.16.20, 173.72.54.131, and 12.1.1.1 reached across the Internet]
Midterm Review
17
My General Philosophy on Tests
I am not a sadist
I am not a masochist
For those of you who only read the slides at home:
  If you don’t attend lectures, then it is your own damn fault if you missed something….
I believe in testing your understanding of the basics, not tripping you up on tiny details or making you calculate pi to 15 decimal places
18
General Guidelines
Know the basics well, rather than focus on details
Study lecture notes and problem sets
Remember: you can use a crib sheet….. 10pt font
Read text only for general context and to learn certain details
Just because I didn’t cover it in review doesn’t mean you don’t need to know it!
Get plenty of sleep
19
Things You Don’t Need to Know
The details of how to fragment packets
The details of any protocol header
  Know semantics, but not syntax
Any details of DNS, HTTP (thank Ganesh)
  Just know that when you access a web page, you do a DNS request and then an HTTP request
  DNS request, DNS reply, SYN, SYNACK, ACK, HTTP Request, HTTP Reply, FIN, FINACK, ACK
20
First half of course: Basics
General background (3 lectures)
  Basic design principles
Idealized view of network (4 lectures)
  Routing
  Reliability
Making this vision real (5 lectures)
  IP, TCP, DNS, Web
Emphasize concepts, but deal with unpleasant realities
21
22
General Background
23
Overview of the Internet
The Internet is a large complicated system that must meet an unprecedented variety of challenges
  Scale, dynamic range, diversity, ad hoc, failures, asynchrony, malice, and greed
An amazing feat of engineering
  Went against the conventional wisdom
  Created a new networking paradigm
In hindsight, some aspects of design are terrible
  Will revisit when we do the clean slate design
  But enormity of genius far outweighs the oversights
Internet’s Five Basic Design Decisions
Packet-switching
Best-effort service model
A single internetworking layer
Layering
The end-to-end principle (and fate-sharing)
24
25
Packet-Switching vs. Circuit-Switching
Reliability advantage: since routers don’t know about individual conversations, when a router or link fails, it is easy to fail over to a different path
Efficiency advantage of packet-switching over circuit switching: exploitation of statistical multiplexing
Deployability advantage: easier for different parties to link their networks together because they’re not promising to reserve resources for one another
Disadvantage: packet-switching must handle congestion
  More complex routers (more buffering, sophisticated dropping)
  Harder to provide good network services (e.g., delay and bandwidth guarantees)
What service should Internet support?
Strict delay bounds?
  Some applications require them
Guaranteed delivery?
  Some applications are sensitive to packet drops
No applications mind getting good service
  Why not require Internet support these guarantees?
26
Important life lessons
People (applications) don’t always need what they think they need
People (applications) don’t always need what we think they need
Flexibility often more important than performance
  But typically only in hindsight!
  Example: cell phones vs landlines
Architect for flexibility, engineer for performance
27
Applying lessons to Internet
Requiring performance guarantees would limit variety of networks that could attach to Internet
Many applications don’t need these guarantees
And those that do? Well, they don’t either (usually)
  Tremendous ability to mask drops, delays
And ISPs can work hard to deliver good service without changing the architecture
28
29
Kahn’s Rules for Interconnection
Each network is independent and must not be required to change (why?)
Best-effort communication (why?)
Boxes (routers) connect networks
No global control at operations level (why?)
Tasks in Networking (bottom up)
Electrons on wire
Bits on wire
Packets on wire
Deliver packets across local network
  Local addresses
Deliver packets across country
  Global addresses
Ensure that packets get there
Do something with the data
30
Resulting Layers
Electrons on wire (contained in next layer)
Bits on wire (Physical)
Packets on wire (contained in next layer)
Deliver packets across local network (Link)
  Local addresses
Deliver packets across country (Internetwork)
  Global addresses
Ensure that packets get there (Transport)
Do something with the data (Application)
31
Decisions and Their Principles
How to break system into modules
  Dictated by Layering
Where modules are implemented
  Dictated by End-to-End Principle
Where state is stored
  Dictated by Fate-Sharing
32
33
Who Does What?
Five layers
  Lower three layers implemented everywhere
  Top two layers implemented only at hosts
What is top layer of router doing?
[Figure: Host A and Host B each show Application, Transport, Network, Datalink, Physical; the Router between them shows Network, Datalink, Physical]
What about switches?
34
Layer Encapsulation
[Figure: a message from User A to User B, wrapped in one header per layer]
  Appl: Get index.html
  Trans: Connection ID
  Net: Source/Dest
  Link: Src/Dest
Common case: 20 bytes TCP header + 20 bytes IP header + 14 bytes Ethernet header = 54 bytes overhead
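The overhead arithmetic on this slide can be checked with a toy encapsulation. This is a sketch only: the headers are zero-filled placeholders of the sizes quoted on the slide, not real TCP, IP, or Ethernet headers, and the function names are made up.

```python
# Header sizes from the slide's "common case".
HEADER_BYTES = {"TCP": 20, "IP": 20, "Ethernet": 14}

def encapsulate(payload: bytes) -> bytes:
    """Wrap the payload in one placeholder header per layer, innermost first."""
    frame = payload
    for layer in ("TCP", "IP", "Ethernet"):
        frame = bytes(HEADER_BYTES[layer]) + frame   # zero-filled stand-in header
    return frame

def overhead_fraction(payload_len: int) -> float:
    """Fraction of the bytes on the wire that are headers."""
    overhead = sum(HEADER_BYTES.values())
    return overhead / (payload_len + overhead)
```

For small payloads the fixed 54 bytes dominate: a 54-byte payload spends half of the frame on headers, while a 1000-byte payload spends roughly 5%.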
35
Pontifications….
General Rules of System Design
System not scalable?
  Add hierarchy
    DNS, IP addressing
System not flexible?
  Add layer of indirection
    DNS names (rather than using IP addresses as names)
System not performing well?
  Add caches
    Web and DNS caching
36
The Paradox of Internet Traffic
The majority of flows are short
  A few packets
The majority of bytes are in long flows
  MB or more
And this trend is accelerating…
37
A Common Pattern…..
Distributions of various metrics (file lengths, access patterns, etc.) often have two properties:
  Large fraction of total metric in the top 10%
  Sizable fraction (~10%) of total metric in low values
Not an exponential distribution
  Large fraction is in top 10%
  But low values have very little of overall total
Lesson: have to pay attention to both ends of dist.
38
39
Fundamental Tasks: Routing and Reliability
40
Routing
“Valid” Routing State
Global routing state is “valid” if it produces forwarding decisions that always deliver packets to their destinations
  Valid is my terminology, not standard
Goal of routing protocols: compute valid state
  But how can you tell if routing state is valid?
41
Necessary and Sufficient Condition
Global routing state is valid if and only if:
  There are no dead ends (other than destination)
  There are no loops
42
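The two conditions can be checked mechanically for a single destination. The sketch below assumes the global routing state is given as a dictionary mapping each node to its next hop toward that destination; the representation and the name `is_valid` are choices made for this example.

```python
def is_valid(next_hop, destination):
    """True if following next hops from every node reaches the destination.

    next_hop maps node -> next node toward `destination`, or None if the
    node has no forwarding entry.
    """
    for start in next_hop:
        seen = set()
        node = start
        while node != destination:
            if node in seen:
                return False              # loop: we came back to a visited node
            seen.add(node)
            node = next_hop.get(node)
            if node is None:
                return False              # dead end before the destination
    return True
```

Since every walk either reaches the destination, revisits a node, or runs out of next hops, ruling out the last two cases is exactly what guarantees delivery.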
How Can You Avoid Loops?
Restrict topology to spanning tree
  If the topology has no loops, packets can’t loop!
Computation over entire graph
  Can make sure no loops
  Link-State
Minimizing metric in distributed computation
  Loops are never the solution to a minimization problem
  Distance vector
Won’t review LS/DV, but will review learning switch
43
Easiest Way to Avoid Loops
Use a topology where loops are impossible!
  Take arbitrary topology
  Build spanning tree (algorithm covered later)
  Ignore all other links (as before)
Only one path to destinations on spanning trees
Use “learning switches” to discover these paths
  No need to compute routes, just observe them
44
A Spanning Tree
45
Flooding on a Spanning Tree
If you want to send a packet that will reach all nodes, then switches can use the following rule:
  Ignoring all ports not on spanning tree!
Originating switch sends “flood” packet out all ports
When a “flood” packet arrives on one incoming port, send it out all other ports
This works because the lack of loops prevents the flooding from cycling back on itself
  Eventually all nodes will be covered, exactly once
46
Flooding on Spanning Tree
47
This Enables Learning!
There is only one path from source to destination
Each switch can learn how to reach another node by remembering where its flooding packets came from!
If flood packet from Node A entered switch from port 4, then to reach Node A, switch sends packets out port 4
48
Learning from Flood Packets
49
[Figure: as Node A's flood packet spreads over the spanning tree, each switch marks the arrival port with "Node A can be reached through this port"]
Once a node has sent a flood message, all other switches know how to reach it….
50
Self-Learning Switch
When a packet arrives
  Inspect source ID, associate with incoming port
  Store mapping in the switch table
  Use time-to-live field to eventually forget mapping
[Figure: switch connected to A, B, C, D; a packet from A tells the switch how to reach A.]
51
Self Learning: Handling Misses
When packet arrives with unfamiliar destination
  Forward packet out all other ports
  Response will teach switch about that destination
[Figure: switch connected to A, B, C, D]
When in doubt, shout!
52
General Rule
When switch receives a packet:
  index the switch table using destination ID
  if entry found for destination {
    if dest on port from which packet arrived
      then drop packet
      else forward packet on port indicated
  }
  else flood
    forward on all but the interface on which the frame arrived
Why do this?
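The learning and forwarding rules above can be put together in a small simulation. This is a sketch under stated assumptions: the `LearningSwitch` class, the `ttl` value, and the caller-supplied clock `now` are all invented for the example, and the "time-to-live" here is the age limit on a table entry, as on the earlier slide.

```python
class LearningSwitch:
    def __init__(self, ports, ttl=60.0):
        self.ports = list(ports)
        self.ttl = ttl                 # seconds before a mapping is forgotten
        self.table = {}                # node ID -> (port, time learned)

    def receive(self, src, dst, in_port, now):
        """Process one packet; return the list of ports it is sent out on."""
        self.table[src] = (in_port, now)          # learn where src lives
        entry = self.table.get(dst)
        if entry is not None and now - entry[1] > self.ttl:
            del self.table[dst]                   # mapping is stale: forget it
            entry = None
        if entry is None:
            # Unknown destination: flood on all ports except the arrival port.
            return [p for p in self.ports if p != in_port]
        out_port = entry[0]
        if out_port == in_port:
            return []                             # dest is back the way it came: drop
        return [out_port]
```

In a short trace, the first packet from A to B is flooded, B's reply is sent only toward A, and from then on traffic between the two uses single ports until an entry ages out.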
Reliability Correctness Condition
Packet is always resent if the previous transmission was lost or corrupted.
Packet may be resent at other times.
  Need not specify this portion…
All the rest is just implementing this invariant
53
54
Core of Real Architecture
Addressing, Forwarding, TCP, DNS, Web
What Tasks Do We Need to Do?
Read packet correctly
Get packet to the destination
Get responses to the packet back to source
Carry data
Tell host what to do with packet once arrived
Specify any special network handling of the packet
Deal with problems that arise along the path
55
Dealing with Problems
Is packet caught in loop? TTL
Header corrupted? Detect with checksum
  What about payload checksum?
Packet too large? Deal with fragmentation
  Split packet apart
  Keep track of how to put together
56
IP Packet Structure
4-bit Version
4-bit Header Length
8-bit Type of Service (TOS)
16-bit Total Length (Bytes)
16-bit Identification
3-bit Flags
13-bit Fragment Offset
8-bit Time to Live (TTL)
8-bit Protocol
16-bit Header Checksum
32-bit Source IP Address
32-bit Destination IP Address
Options (if any)
Payload
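For readers who want to see the field widths in action, here is a minimal sketch that unpacks the fixed 20-byte part of an IPv4 header with Python's `struct` module. The function name and the dictionary keys are choices made for this example, and options are ignored.

```python
import socket
import struct

def parse_ipv4_header(data: bytes) -> dict:
    """Decode the first 20 bytes of an IPv4 header (options not handled)."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    return {
        "version": ver_ihl >> 4,                    # top 4 bits
        "header_len_bytes": (ver_ihl & 0x0F) * 4,   # field counts 32-bit words
        "tos": tos,
        "total_length": total_len,
        "identification": ident,
        "flags": flags_frag >> 13,                  # top 3 bits
        "fragment_offset": flags_frag & 0x1FFF,     # low 13 bits
        "ttl": ttl,
        "protocol": proto,
        "header_checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }
```

Fields narrower than a byte (version, header length, flags, fragment offset) share bytes with their neighbors, so they are recovered with shifts and masks.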
IPv4 and IPv6 Header Comparison
IPv4 header fields: Version, IHL, Type of Service, Total Length, Identification, Flags, Fragment Offset, Time to Live, Protocol, Header Checksum, Source Address, Destination Address, Options, Padding
IPv6 header fields: Version, Traffic Class, Flow Label, Payload Length, Next Header, Hop Limit, Source Address, Destination Address
Legend:
  Field name kept from IPv4 to IPv6
  Fields not kept in IPv6
  Name & position changed in IPv6
  New field in IPv6
Summary of Changes
Eliminated fragmentation (why?)
Eliminated header length (why?)
Eliminated checksum (why?)
New options mechanism (next header) (why?)
Expanded addresses (why?)
Added Flow Label (why?)
59
Philosophy of Changes
Don’t deal with problems: leave to ends
  Eliminated fragmentation
  Eliminated checksum
  Why retain TTL?
Simplify handling:
  New options mechanism (uses next header approach)
  Eliminated header length
    Why couldn’t IPv4 do this?
Provide general flow label for packet
  Not tied to semantics
  Provides great flexibility
60
Comparison of Design Philosophy
[Figure: the IPv4 and IPv6 header layouts side by side, with fields grouped by the task they serve]
Legend:
  To Destination and Back (expanded)
  Deal with Problems (greatly reduced)
  Read Correctly (reduced)
  Special Handling (similar)
Original Internet Addresses
First eight bits: network address (/8)
Last 24 bits: host address
Assumed 256 networks were more than enough!
62
63
Next Design: Classful Addressing
Class A: if first byte in [0..127] assume /8 (top bit = 0)
  Very large blocks (e.g., MIT has 18.0.0.0/8)
Class B: first byte in [128..191] assume /16 (top bits = 10)
  Large blocks (e.g., UCB has 128.32.0.0/16)
Class C: [192..223] assume /24 (top bits = 110)
  Small blocks (e.g., ICIR has 192.150.187.0/24)
  (My house used to have a /25)
Address layouts:
  Class A: 0******* ******** ******** ********
  Class B: 10****** ******** ******** ********
  Class C: 110***** ******** ******** ********
64
Classful Addressing (cont’d)
Class D: [224..239] (top bits 1110)
  Multicast groups
Class E: [240..255] (top bits 1111)
  Reserved for future use
What problems can classful addressing lead to?
  Only comes in 3 sizes
  Routers can end up knowing about many class C’s (/24s)
  Wasted address space
Address layouts:
  Class D: 1110**** ******** ******** ********
  Class E: 1111**** ******** ******** ********
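Under classful addressing, the class and the implied prefix length follow from the first byte alone. A minimal sketch using the byte ranges from these two slides (the function name `address_class` is made up):

```python
def address_class(addr: str):
    """Return (class letter, implied prefix length or None) for a dotted quad."""
    first = int(addr.split(".")[0])
    if first < 128:
        return ("A", 8)
    if first < 192:
        return ("B", 16)
    if first < 224:
        return ("C", 24)
    if first < 240:
        return ("D", None)     # multicast: no network/host split
    return ("E", None)         # reserved
```

The slide's examples land where expected: 18.x.x.x is class A (/8), 128.32.x.x is class B (/16), and 192.150.187.x is class C (/24). No mask is carried anywhere, which is the contrast with CIDR.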
Today’s Addressing: CIDR
CIDR = Classless Interdomain Routing
Flexible division between network and host addresses
Must specify both address and mask
  Clarifies where boundary between addresses lies
  Classful addressing communicates this with first few bits
  CIDR requires explicit mask
65
66
CIDR Addressing
IP Address: 12.4.0.0    IP Mask: 255.254.0.0
  Address: 00001100 00000100 00000000 00000000
  Mask:    11111111 11111110 00000000 00000000
The first 15 bits (where the mask is 1) are the network prefix; the remaining 17 bits are for hosts.
Use two 32-bit numbers to represent a network.
Network number = IP address + Mask
Written as 12.4.0.0/15 or 12.4/15
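The address-plus-mask representation can be explored with Python's standard `ipaddress` module. This is a small sketch; the helper `network_number` is a name invented here, and it simply ANDs an address with a mask.

```python
import ipaddress

def network_number(address: str, mask: str) -> str:
    """Bitwise-AND a dotted-quad address with a dotted-quad mask."""
    a = int(ipaddress.IPv4Address(address))
    m = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(a & m))

# The slide's network, written with its explicit mask.
net = ipaddress.ip_network("12.4.0.0/255.254.0.0")
prefix_len = net.prefixlen            # number of 1 bits in the mask
host_count = net.num_addresses        # 2 ** (32 - prefix_len)
inside = ipaddress.ip_address("12.5.255.255") in net
outside = ipaddress.ip_address("12.6.0.0") in net
```

Here `str(net)` is "12.4.0.0/15", so the slash notation is just a compact way of writing the same mask.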
67
Obtaining a Block of Addresses
Allocation is also hierarchical
  Prefix: assigned to an institution
  Addresses: assigned by the institution to their nodes
Who assigns prefixes?
  Internet Corporation for Assigned Names and Numbers (ICANN)
    Allocates large address blocks to Regional Internet Registries
    ICANN is politically charged
  Regional Internet Registries (RIRs)
    E.g., ARIN (American Registry for Internet Numbers)
    Allocates address blocks within their regions
    Allocated to Internet Service Providers and large institutions ($$)
  Internet Service Providers (ISPs)
    Allocate address blocks to their customers (could be recursive)
    Often w/o charge
68
Dynamic Host Configuration Protocol
[Figure: message exchange between an arriving client and the DHCP server at 203.1.2.5]
  DHCP discover (broadcast)
  DHCP offer (broadcast)
  DHCP request (broadcast)
  DHCP ACK (broadcast)
Why all the broadcasts?
69
Network Address Translation (NAT)
Before NAT…every machine connected to Internet had unique IP address
[Figure: clients 1.2.3.4 and 1.2.3.5 on a LAN reach server 5.6.7.8 across the Internet]
  Request:  src addr 1.2.3.4, dest addr 5.6.7.8, src port 1001, dst port 80
  Response: src addr 5.6.7.8, dest addr 1.2.3.4, src port 80, dst port 1001
70
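NAT (cont’d)
Assign addresses to machines behind same NAT
  Usually in address block 192.168.0.0/16
Use port numbers to multiplex single address
[Figure: clients 192.2.3.4 and 192.2.3.5 sit behind a NAT with public address 1.2.3.4 and reach server 5.6.7.8 across the Internet]
  NAT table: 192.2.3.4:1001 <-> 1.2.3.4:2000
  Outbound, client to NAT: src 192.2.3.4:1001, dst 5.6.7.8:80
  Outbound, NAT to server: src 1.2.3.4:2000, dst 5.6.7.8:80
  Inbound, server to NAT:  src 5.6.7.8:80, dst 1.2.3.4:2000
  Inbound, NAT to client:  src 5.6.7.8:80, dst 192.2.3.4:1001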
71
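NAT (cont’d)
Assign addresses to machines behind same NAT
  Usually in address block 192.168.0.0/16
Use port numbers to multiplex single address
[Figure: a second client, 192.2.3.5, uses the same source port 1001 behind the NAT with public address 1.2.3.4]
  NAT table:
    192.2.3.4:1001 <-> 1.2.3.4:2000
    192.2.3.5:1001 <-> 1.2.3.4:2001
  Outbound, client to NAT: src 192.2.3.5:1001, dst 5.6.7.8:80
  Outbound, NAT to server: src 1.2.3.4:2001, dst 5.6.7.8:80
  Inbound, server to NAT:  src 5.6.7.8:80, dst 1.2.3.4:2001
  Inbound, NAT to client:  src 5.6.7.8:80, dst 192.2.3.5:1001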
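The port multiplexing shown on the NAT slides can be sketched as a pair of lookup tables. The `Nat` class, its method names, and the choice to hand out public ports sequentially from 2000 are assumptions made to reproduce the slide's numbers; real NATs also track protocol, timeouts, and more.

```python
class Nat:
    def __init__(self, public_ip, first_port=2000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out = {}     # (private ip, private port) -> public port
        self.back = {}    # public port -> (private ip, private port)

    def outbound(self, src, dst):
        """Rewrite the source of a packet leaving the private network."""
        if src not in self.out:
            self.out[src] = self.next_port        # allocate a fresh public port
            self.back[self.next_port] = src
            self.next_port += 1
        return (self.public_ip, self.out[src]), dst

    def inbound(self, src, dst):
        """Rewrite the destination of a packet arriving from the Internet."""
        ip, port = dst
        if ip != self.public_ip or port not in self.back:
            return None                           # no mapping: drop the packet
        return src, self.back[port]
```

Two clients that both pick source port 1001 end up on distinct public ports (2000 and 2001 here), which is how the NAT tells their return traffic apart.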
72
Forwarding
73
Scalability via Address Aggregation
Provider is given 201.10.0.0/21 (201.10.0.x .. 201.10.7.x)
Each customer given smaller prefix:
  201.10.0.0/22
  201.10.4.0/24
  201.10.5.0/24
  201.10.6.0/23
Routers in the rest of the Internet just need to know how to reach 201.10.0.0/21. The provider can direct the IP packets to the appropriate customer.
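The aggregation claim can be verified with Python's standard `ipaddress` module: the four customer prefixes fit inside the provider's /21 and together cover it exactly. This is only a check of the slide's numbers; the variable names are arbitrary.

```python
import ipaddress

provider = ipaddress.ip_network("201.10.0.0/21")
customers = [ipaddress.ip_network(p) for p in
             ("201.10.0.0/22", "201.10.4.0/24", "201.10.5.0/24", "201.10.6.0/23")]

# Every customer block lies inside the provider's block.
all_inside = all(c.subnet_of(provider) for c in customers)

# Merging the customer blocks yields the single prefix the core needs.
aggregated = list(ipaddress.collapse_addresses(customers))
```

The rest of the Internet therefore carries one entry instead of four, and only the provider's own routers need the finer-grained prefixes.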
Global Picture
74
Router in Internet Core:
  201.10.0/21   Port 1
  201.11.0/21   Port 2
  202/8         Port 4
  ……………..
Router in ISP:
  201.10.0/22   Port 1
  201.10.4/24   Port 2
  201.10.5/24   Port 3
  201.10.6/23   Port 4
Only /21 listed in core
/22, /23, /24 only listed in ISP’s router
75
Aggregation Not Always Possible
[Figure: Provider 1 holds 201.10.0.0/21 with customers 201.10.0.0/22, 201.10.4.0/24, 201.10.5.0/24, and 201.10.6.0/23; the 201.10.6.0/23 customer is also connected to Provider 2]
Multi-homed customer with 201.10.6.0/23 has two providers. Other parts of the Internet need to know how to reach these destinations through both providers.
/23 route must be globally visible
Multihoming Global Picture
76
Router in Internet Core:
  201.10.0/21   Port 1
  201.10.6/23   Port 2
  201.11.0/21   Port 3
  ……………..
Router in ISP1:
  201.10.0/22   Port 1
  201.10.4/24   Port 2
  201.10.5/24   Port 3
  201.10.6/23   Port 4
Router in ISP2:
  201.10.6/23   Port 1
  201.11.0/21   Port 2
  201.12.0/21   Port 3
  201.13.0/21   Port 4
Simple Example
  0**   Port 1
  100   Port 2
  101   Port 1
  11*   Port 1
77
78
Prefix Tree
[Figure: binary tree over 3-bit addresses; the root *** branches on 0/1 to 0** and 1**, then to 00*, 01*, 10*, 11*, then to the leaves 000 through 111; port labels P1, P2, P1, P1 mark the table entries]
79
More Compact Representation
[Figure: compact tree keeping only the path *** -> 1** -> 10* -> 100, with port labels P1 and P2]
Record port associated with first match, and only over-ride when it matches another prefix during walk down tree
This is longest prefix match (LPM)
If you ever leave path, you are done, last matched prefix is answer
Forwarding Optimization
LPM requires fewest entries and fewest bits walked
80
Longest Prefix Match Representation
  ***   Port 1
  100   Port 2
If address matches both, then take longest match
81
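The equivalence between the four-entry table and the two-entry LPM table can be checked directly. The sketch below does a simple linear scan over 3-bit address strings rather than the tree walk described on the slide; the function name and table representation are assumptions for the example.

```python
def longest_prefix_match(table, address):
    """Return the port of the longest prefix in `table` matching `address`.

    Prefixes are strings such as "10*" where "*" marks wildcard bits.
    """
    best_len, best_port = -1, None
    for prefix, port in table.items():
        bits = prefix.rstrip("*")                 # the fixed leading bits
        if address.startswith(bits) and len(bits) > best_len:
            best_len, best_port = len(bits), port
    return best_port

FULL_TABLE = {"0**": 1, "100": 2, "101": 1, "11*": 1}   # from "Simple Example"
LPM_TABLE = {"***": 1, "100": 2}                        # compact representation

ALL_ADDRESSES = [format(n, "03b") for n in range(8)]
same = all(longest_prefix_match(FULL_TABLE, a) == longest_prefix_match(LPM_TABLE, a)
           for a in ALL_ADDRESSES)
```

Every address except 100 falls through to the default entry ***, so two entries forward exactly like the original four.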
82
Example
[Figure: binary prefix tree with branches labeled 0 and 1; legend: prefix destined for Provider 1, prefix destined for Provider 2]
No packet will match more than one prefix
All paths reach a unique prefix
83
More Compact Representation
[Figure: compacted prefix tree; legend: prefix destined for Provider 1, prefix destined for Provider 2]
84
Transport
Role of Transport Layer
Provide common end-to-end services for app layer
  Deal with network on behalf of applications
  Deal with applications on behalf of networks
Could have been built into apps, but want common implementations to make app development easier
  Since TCP runs on end host, this is about software modularity, not overall network architecture
85
86
TCP Header
[Figure: TCP header layout, one row per line]
  Source port | Destination port
  Sequence number
  Acknowledgment
  HdrLen | 0 | Flags | Advertised window
  Checksum | Urgent pointer
  Options (variable)
  Data
Example
Packet arrives: Seq: 2323, Ack: 4001, W=3000 [no payload]
Appropriate response?
  Seq: 4001, payload: 4001-8000
  Seq: 2001, payload: 2001-5000
  Seq: 4001, payload: 4001-5000
  Seq: 5001, payload: 5001-6000
  Seq: 8001, payload: 8001-9000
87
88
Advertised Window Limits Rate
Sender can send no faster than W/RTT bytes/sec
In ideal case, throughput = MIN [W/RTT, B]
  Where B is bottleneck on path
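The throughput bound can be evaluated for concrete numbers. The function below is a direct transcription of MIN[W/RTT, B]; the parameter names and the sample values in the usage note are made up for illustration.

```python
def ideal_throughput(window_bytes, rtt_seconds, bottleneck_bytes_per_sec):
    """Ideal-case throughput in bytes/sec: the smaller of W/RTT and B."""
    return min(window_bytes / rtt_seconds, bottleneck_bytes_per_sec)
```

With W = 3000 bytes and RTT = 0.5 s, the window caps the rate at 6000 bytes/sec even on a much faster path; with W = 65536 bytes and RTT = 0.25 s, W/RTT is 262144 bytes/sec, so a 100000 bytes/sec bottleneck becomes the limit instead.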
89
Establishing a TCP Connection
Three-way handshake to establish connection
  Host A sends a SYN (open; “synchronize sequence numbers”) to host B
  Host B returns a SYN acknowledgment (SYN ACK)
  Host A sends an ACK to acknowledge the SYN ACK
[Figure: timeline between A and B showing SYN, SYN ACK, ACK, followed by Data]
Each host tells its ISN to the other host.
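The exchange of initial sequence numbers (ISNs) can be traced with a toy model. The dictionaries below stand in for segments and are not a real TCP implementation; the one protocol fact relied on is that a SYN consumes one sequence number, so each side acknowledges the other's ISN plus one.

```python
SEQ_SPACE = 2 ** 32   # TCP sequence numbers wrap around at 32 bits

def three_way_handshake(isn_a, isn_b):
    """Return the three handshake segments as simple dictionaries."""
    syn = {"from": "A", "flags": {"SYN"}, "seq": isn_a}
    syn_ack = {"from": "B", "flags": {"SYN", "ACK"}, "seq": isn_b,
               "ack": (syn["seq"] + 1) % SEQ_SPACE}       # acknowledges A's ISN
    ack = {"from": "A", "flags": {"ACK"},
           "seq": (syn["seq"] + 1) % SEQ_SPACE,
           "ack": (syn_ack["seq"] + 1) % SEQ_SPACE}       # acknowledges B's ISN
    return [syn, syn_ack, ack]
```

After the third segment, both hosts know the other's ISN and have confirmation that their own was received.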