Class 8 IP Forwarding Routing Theophilus Benson Based partly on lecture notes by Rodrigo Fonseca David Mazières Phil Levis John Jannotti Administrivia Midterm 1 Day after UNC game New proposed dates ID: 643540
Slide1
CPS-356: Computer Networks
Class 8: IP Forwarding + Routing
Theophilus Benson
Based partly on lecture notes by Rodrigo Fonseca, David Mazières, Phil Levis, John Jannotti
Slide2
Administrivia
Midterm 1: day after UNC game. New proposed date: 02/24/2015
HW #1: going up tomorrow on the website (due in a week, 02/12/2015)
Slide3
Today’s Lecture
Forwarding
IP-Address/IP-Packet Format
Fragmentation
Debugging the network: ICMP
Getting IP-Address: ARP + DHCP
Routing
Intra-Domain Routing
Distance Vector Protocol
Loop Detection + Avoidance
Slide4
Format of IP Addresses
Classed Addresses
Pros
Very simple to use and implement
Allows for hierarchical routing
Use the first 3 bits to determine the address class (A, B, C)
Based on the class, you know which bits to ignore
Cons
Wasteful allocation
Network and host portions of the address are statically specified
CIDR Addresses
Pros
Efficient allocation of resources
Network and host portions of the address are dynamically specified
Cons
More complex to implement in hardware
Slide5
Format of IP Addresses
Classed Addresses (static partitioning of network/host portions)
Class A (8-bit prefix), B (16-bit), C (24-bit)
Example: 128.23.92.12 (first octet 10000000)
CIDR (dynamic partitioning of network/host portions)
128.23.16.12/31
The /31 specifies the prefix size: the number of bits in the network portion (NetMask)
NetMask: 11111111.11111111.11111111.11111110
Address: 10000000.00010111.00010000.00001100
Prefix size = 31 bits
Host size = 32 - 31 = 1 bit
Only 2^1 = 2 hosts in the network
Slide6
Other CIDR Examples
128.23.16.12/32
NetMask: 11111111.11111111.11111111.11111111
Address: 10000000.00010111.00010000.00001100
Prefix size = 32 bits
Host size = 32 - 32 = 0 bits
Only 2^0 = 1 host in the network
128.23.16.12/24
NetMask: 11111111.11111111.11111111.00000000
Address: 10000000.00010111.00010000.00001100
Prefix size = 24 bits
Host size = 32 - 24 = 8 bits
Only 2^8 = 256 hosts in the network
Slide7
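The CIDR arithmetic on the preceding slides can be checked with a short sketch. It uses Python's standard `ipaddress` module; the helper name `describe` is made up for this illustration and is not part of the lecture.

```python
import ipaddress

def describe(cidr):
    """Summarize an address/prefix pair such as '128.23.16.12/31'."""
    iface = ipaddress.ip_interface(cidr)   # accepts an address with host bits set
    net = iface.network
    host_bits = 32 - net.prefixlen         # IPv4 addresses are 32 bits long

    def to_bits(packed):
        # Render 4 raw bytes as dotted binary, e.g. 11111111.11111111...
        return ".".join(f"{byte:08b}" for byte in packed)

    return {
        "netmask_bits": to_bits(net.netmask.packed),
        "address_bits": to_bits(iface.ip.packed),
        "prefix_bits": net.prefixlen,
        "host_bits": host_bits,
        "addresses": 2 ** host_bits,
    }

for example in ("128.23.16.12/31", "128.23.16.12/32", "128.23.16.12/24"):
    info = describe(example)
    print(example, info["netmask_bits"], info["addresses"])
```

The count printed is 2^(host bits), matching the slides; it counts addresses in the block, not necessarily usable hosts.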
Where Does IP-Address Fit Into a packet?
[Figure: packet layout as nested headers. Ethernet header (Destination MAC Address, Source MAC Address, Type); IP header (Source IP Address, Destination IP Address, TTL, Protocol, Hdr checksum, Total Length, TOS, Identification, Frag, Options, Padding); TCP header (Src Port, Dst Port, Seq Number, Ack Number, Offset, Reserved, Window); Data (Payload)]
Slide8
IP v4 packet format
[Figure: IPv4 header layout: vers, Hdr len, TOS, Total Length; Identification, DF, MF, Fragment Offset; TTL, Protocol, Hdr Checksum; Source IP Address; Destination IP Address; Options, Padding]
Forward based on destination address
TTL = Time to Live
Keeps packets from circulating forever in forwarding loops
Decremented at each hop
Fragmentation: cut large packets into smaller ones
E.g., from Ethernet to ATM
From 1500B to 64B
MF: more fragments
DF: don’t fragment (return an error to the sender)
Version = IPv4 or IPv6
Protocol = TCP/UDP?
Header length = size of the header, which varies with the number of options
Total length = length of header + payload
Slide14
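As a rough illustration of the fields described above, the sketch below unpacks the fixed 20-byte part of an IPv4 header with Python's `struct` module. The function name `parse_ipv4_header` and the sample values are assumptions made for this example; options and checksum verification are left out.

```python
import ipaddress
import struct

def parse_ipv4_header(raw):
    """Decode the fixed 20-byte IPv4 header (options are not parsed)."""
    (ver_ihl, tos, total_length, ident, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,                   # high 4 bits
        "header_len": (ver_ihl & 0x0F) * 4,        # Hdr len is counted in 32-bit words
        "tos": tos,
        "total_length": total_length,              # header + payload, in bytes
        "identification": ident,
        "df": bool(flags_frag & 0x4000),           # don't fragment
        "mf": bool(flags_frag & 0x2000),           # more fragments
        "fragment_offset": flags_frag & 0x1FFF,    # in units of 8 bytes
        "ttl": ttl,
        "protocol": protocol,                      # e.g. 6 = TCP, 17 = UDP
        "checksum": checksum,
        "src": str(ipaddress.IPv4Address(src)),
        "dst": str(ipaddress.IPv4Address(dst)),
    }

# Hypothetical header: IPv4, no options, TTL 64, TCP, MF set, checksum left at 0.
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 532, 213, 0x2000, 64, 6, 0,
                     bytes([128, 23, 16, 12]), bytes([128, 23, 16, 14]))
print(parse_ipv4_header(sample))
```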
Today’s Lecture
Forwarding
IP-Address/IP-Packet Format
Fragmentation
Network Error Messages (Debugging): ICMP
Getting IP-Address: ARP + DHCP
Routing
Intra-Domain Routing: RIP
Slide15
Why Do You Need to Fragment Packets?
Different networks have different MTUs.
A router may need to fragment packets to allow them to cross different media
[Figure: DukeNet (Ethernet, MTU=1500), Le Theo Net (ATM, MTU=64), ATT (Ethernet, MTU=1500)]
Slide16
Implication of Fragmentation
If a fragment is lost, must retransmit the whole packet!!!
Why?
The packet cannot be reassembled until all fragments are received
Some people avoid fragmentation!!!!
Slide17
What do Fragmented Packets look like?
Use ‘identification’, ‘fragment offset’ and ‘MF’ bit in IP header
Set the ‘MF’ bit on every fragment except the last
Use the same ‘Id’ for all fragments
Offset gives the fragment’s position in the original packet, in units of 8 bytes
Original packet: Id = 213, MF = 0, Offset = 0, 1400 bytes of data
Fragment 1: Id = 213, MF = 1, Offset = 0, 512 bytes of data
Fragment 2: Id = 213, MF = 1, Offset = 64, 512 bytes of data
Fragment 3: Id = 213, MF = 0, Offset = 128, 376 bytes of data
Slide18
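The fragmentation example above (1400 bytes of data split into 512 + 512 + 376) can be reproduced with a small sketch. The function `fragment` and its parameters are hypothetical names chosen for this illustration; it only computes the header fields and does not build real packets.

```python
def fragment(data_len, max_data, ident):
    """Split data_len bytes into fragments carrying at most max_data bytes each.

    Returns (id, mf, offset, size) tuples, where offset is in units of 8 bytes.
    """
    max_data -= max_data % 8          # every fragment but the last must be a multiple of 8
    fragments = []
    position = 0
    while position < data_len:
        size = min(max_data, data_len - position)
        more = position + size < data_len     # MF is set on all but the last fragment
        fragments.append((ident, int(more), position // 8, size))
        position += size
    return fragments

for frag in fragment(1400, 512, ident=213):
    print("Id=%d MF=%d Offset=%d bytes=%d" % frag)
```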
Internet Control Message Protocol (ICMP)
Echo (ping)
Redirect
Destination unreachable (protocol, port, or host)
TTL exceeded
Checksum failed
Reassembly failed
Can’t fragment
Many ICMP messages include part of packet that triggered them
See http://www.iana.org/assignments/icmp-parameters
Slide19
ICMP message format
Slide20
Example: Time Exceeded
Code usually 0 (TTL exceeded in transit)
Discussion: traceroute
Slide21
Example: Can’t Fragment
Sent if DF=1 and packet length > MTU
What can you use this for?
Path MTU Discovery
Can do binary search on packet sizes
But better: base algorithm on most common MTUs
Slide22
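The two Path MTU Discovery strategies mentioned above can be compared in a simulation. Everything here is an assumption made for illustration: `make_probe` stands in for sending a DF=1 packet and watching for an ICMP Can't Fragment reply, and `COMMON_MTUS` is an example list, not an authoritative one.

```python
COMMON_MTUS = [9000, 4352, 1500, 1492, 1280, 576]   # illustrative values only

def make_probe(link_mtus):
    """Return a function that says whether a DF=1 packet of a given size gets through."""
    smallest = min(link_mtus)
    def probe(size):
        return size <= smallest        # larger packets would trigger Can't Fragment
    return probe

def pmtu_binary_search(probe, low=1, high=65535):
    """Find the largest size that passes, assuming size `low` always passes."""
    probes = 0
    while low < high:
        mid = (low + high + 1) // 2
        probes += 1
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low, probes

def pmtu_common_sizes(probe, candidates=COMMON_MTUS):
    """Try well-known MTUs from largest to smallest and stop at the first that passes."""
    probes = 0
    for size in sorted(candidates, reverse=True):
        probes += 1
        if probe(size):
            return size, probes
    return None, probes

probe = make_probe([1500, 576, 1500])
print(pmtu_binary_search(probe))
print(pmtu_common_sizes(probe))
```

When the true path MTU is one of the common values, the second strategy usually needs far fewer probes; if it is not, it returns the next smaller candidate and underestimates.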
Today’s Lecture
Forwarding
IP-Address/IP-Packet Format
Fragmentation
Debugging the network: ICMP
Getting IP-Address: ARP + DHCP
Routing
Intra-Domain Routing: RIP
Slide23
How do you Make a Packet
[Figure: packet layout as nested headers. Ethernet header (Destination MAC Address, Source MAC Address, Type); IP header (Source IP Address, Destination IP Address, TTL, Protocol, Hdr checksum, Total Length, TOS, Identification, Frag, Options, Padding); TCP header (Src Port, Dst Port, Seq Number, Ack Number, Offset, Reserved, Window); Data (Payload)]
Source MAC Address: comes with your hardware
Destination MAC Address: ???????
Destination IP Address: DNS gives this to you
Slide24
Obtaining Host IP Addresses - DHCP
An address must be assigned to each host by its network.
Manually: tedious and error-prone
Automatically: Dynamic Host Configuration Protocol
Client: DHCP Discover to 255.255.255.255 (broadcast)
Server(s): DHCP Offer to 255.255.255.255 (why broadcast?)
Client: choose offer, DHCP Request (broadcast, why?)
Server: DHCP ACK (again broadcast)
Result: IP address, gateway, netmask, DNS server
Slide25
Obtaining IP Addresses
Blocks of IP addresses allocated hierarchically
ISP obtains an address block, may subdivide
ISP: 128.35.16/20
10000000 00100011 0001 0000 00000000
Client 1: 128.35.16/22
10000000 00100011 000100 00 00000000
Client 2: 128.35.20/22
10000000 00100011 000101 00 00000000
Client 3: 128.35.24/21
10000000 00100011 00011 000 00000000
Global allocation: ICANN, /8’s (ran out!)
Regional registries: ARIN, RIPE, APNIC, LACNIC, AFRINIC
Slide26
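The subdivision of the ISP block above can be sanity-checked with Python's standard `ipaddress` module. This is only a sketch: `check_allocation` is a made-up helper, and `subnet_of` requires Python 3.7 or later.

```python
import ipaddress

isp = ipaddress.ip_network("128.35.16.0/20")
clients = {
    "Client 1": ipaddress.ip_network("128.35.16.0/22"),
    "Client 2": ipaddress.ip_network("128.35.20.0/22"),
    "Client 3": ipaddress.ip_network("128.35.24.0/21"),
}

def check_allocation(parent, children):
    """Return (all inside parent, no overlaps, total addresses handed out)."""
    nets = list(children.values())
    inside = all(net.subnet_of(parent) for net in nets)
    disjoint = not any(a.overlaps(b)
                       for i, a in enumerate(nets)
                       for b in nets[i + 1:])
    handed_out = sum(net.num_addresses for net in nets)
    return inside, disjoint, handed_out

print(check_allocation(isp, clients), isp.num_addresses)
```

The three client blocks together use every address of the /20 (1024 + 1024 + 2048 = 4096).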
How do you Make a Packet
[Figure: packet layout as nested headers. Ethernet header (Destination MAC Address, Source MAC Address, Type); IP header (Source IP Address, Destination IP Address, TTL, Protocol, Hdr checksum, Total Length, TOS, Identification, Frag, Options, Padding); TCP header (Src Port, Dst Port, Seq Number, Ack Number, Offset, Reserved, Window); Data (Payload)]
Source MAC Address: comes with your hardware
Destination MAC Address: ???????
Destination IP Address: DNS gives this to you
Source IP Address: DHCP
Slide27
What is the Destination Address?
If dest. is in your network (e.g., Alice to Bob)
Then use the destination’s Ethernet address.
If dest. is not in your network (e.g., Alice to Google)
Then use the gateway router’s Ethernet address.
The destination may use a different protocol
[Figure: DukeNet (Ethernet) with Alice and Bob, Le Theo Net (ATM), and Google; links labeled Ethernet or ATM]
Slide28
How do you find this destination address?
Check local ARP table
If found use it. (DONE!)
Start sending packets!
[Figure: DukeNet (Ethernet) with Alice and Bob; links labeled Ethernet or ATM]
Slide29
How do you find this destination address?
Check local ARP table
If found use it. (DONE!)
Compare my IP with dest IP
In same network? Then ARP request for dest IP
In different networks? Then ARP request for router IP
Alice: 128.23.16.12/30
Bob: 128.23.16.14
Google: 128.16.16.16
DukeNet: 128.23.16.12/30
4 addresses: 128.23.16.12 to 128.23.16.15
Alice->Bob: same network
Alice->Google: diff network
Slide30
How do you find this destination address?
Alice: 128.23.16.12/30
Bob: 128.23.16.14
Google: 128.16.16.16
DukeNet: 128.23.16.12/30
4 addresses: 128.23.16.12 to 128.23.16.15
Alice->Bob: same network
Alice->Google: diff network
[Figure: DukeNet (Ethernet) with Alice and Bob; links labeled Ethernet or ATM]
Slide31
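The same-network test in the Alice/Bob/Google example can be sketched as follows. The function name `arp_target` and the gateway address 128.23.16.13 are assumptions for illustration; the lecture does not give the router's address.

```python
import ipaddress

def arp_target(my_interface, dest_ip, gateway_ip):
    """Pick the IP address to ARP for when sending to dest_ip."""
    my_network = ipaddress.ip_interface(my_interface).network
    if ipaddress.ip_address(dest_ip) in my_network:
        return dest_ip        # same network: resolve the destination itself
    return gateway_ip         # different network: resolve the gateway router

alice = "128.23.16.12/30"
gateway = "128.23.16.13"      # hypothetical router address inside Alice's /30

print(arp_target(alice, "128.23.16.14", gateway))   # Bob
print(arp_target(alice, "128.16.16.16", gateway))   # Google
```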
How ARP works.
[Figure: DukeNet (Ethernet) with Alice and Bob; links labeled Ethernet or ATM]
Alice broadcasts: "I am: 128.23.16.12. Who is IP: 128.23.16.14?"
Slide32
How ARP works.
Alice broadcasts: "I am: 128.23.16.12. Who is IP: 128.23.16.14?"
Nodes that hear the request: "Now I know who 128.23.16.12 is!" (shown twice)
Slide33
How ARP works.
Bob replies: "I am: 128.23.16.14, MacAdd: 02……….."
Nodes that hear the reply: "Now I know who 128.23.16.14 is!" (shown twice)
Slide34
How ARP works.
Alice broadcasts: "I am: 128.23.16.12. Who is IP: 128.23.16.14?"
Nodes that hear the request: "Now I know who 128.23.16.12 is!"
Bob replies: "I am: 128.23.16.14, MacAdd: 02……….."
Nodes that hear the reply: "Now I know who 128.23.16.14 is!"
Slide35
ARP Ethernet frame format
Why include source hardware address?
Slide36
How do you Make a Packet
[Figure: packet layout as nested headers. Ethernet header (Destination MAC Address, Source MAC Address, Type); IP header (Source IP Address, Destination IP Address, TTL, Protocol, Hdr checksum, Total Length, TOS, Identification, Frag, Options, Padding); TCP header (Src Port, Dst Port, Seq Number, Ack Number, Offset, Reserved, Window); Data (Payload)]
Source MAC Address: comes with your hardware
Destination MAC Address: ARP
Destination IP Address: DNS gives this to you
Source IP Address: DHCP
Slide37
Today’s Lecture
Forwarding
IP-Address/IP-Packet Format
Fragmentation
Debugging the network: ICMP
Getting IP-Address: ARP + DHCP
Routing
Intra-Domain Routing
Distance Vector Protocol
Loop Detection + Avoidance
Slide38
Routing
Routing is the process of updating forwarding tables
Routers exchange messages about routers or networks they can reach
Goal: find optimal route for every destination
… or maybe a good route, or any route (depending on scale)
Challenges
Dynamic topology
Decentralized
Scale
Slide39
Scaling Issues
Every router must be able to forward based on any destination IP address
Given address, it needs to know next hop
Naïve: one entry per address
There would be 10^8 entries!
Solutions
Hierarchy (many examples)
Address aggregation
Address allocation is very important (should mirror topology)
Default routes
Slide40
IP Connectivity
For each destination address, must either:
Have prefix mapped to next hop in forwarding table
Know “smarter router”: default for unknown prefixes
Route using longest prefix match, default is prefix 0.0.0.0/0
Core routers know everything, so they have no default
Manage using notion of Autonomous System (AS)
Slide41
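Longest prefix match with a 0.0.0.0/0 default, as described above, can be sketched with a linear scan over a small table. The prefixes and next-hop names (`R1`, `R2`, `default-gw`) are invented for this example, and real routers use specialized data structures rather than a scan.

```python
import ipaddress

# Hypothetical forwarding table: (prefix, next hop)
FORWARDING_TABLE = [
    ("0.0.0.0/0", "default-gw"),      # matches everything, with prefix length 0
    ("128.23.0.0/16", "R1"),
    ("128.23.16.0/24", "R2"),
]

def lookup(table, dest):
    """Return the next hop of the longest prefix that contains dest, or None."""
    address = ipaddress.ip_address(dest)
    best_net, best_hop = None, None
    for prefix, next_hop in table:
        net = ipaddress.ip_network(prefix)
        if address in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_hop = net, next_hop
    return best_hop

for dest in ("128.23.16.12", "128.23.92.12", "128.16.16.16"):
    print(dest, "->", lookup(FORWARDING_TABLE, dest))
```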
Internet structure, 1990
Several independent organizations
Hierarchical structure with single backbone
Slide42
Internet structure, today
Multiple backbones, more arbitrary structure
Slide43
Autonomous Systems
Correspond to an administrative domain
AS’s reflect organization of the Internet
E.g., DukeNet, large company, etc.
Identified by a 16-bit number (32-bit AS numbers also exist now)
AS’s are often loosely called ISPs
ISP = Internet Service Provider
Slide44
[Figure: Le Theo Net, DukeNet (routers A, B, C, D), and ATT; DukeNet connects to ATT over Lnk 1 and Lnk 2]
AS’s choose their own local routing algorithm
How should A, B, C, D do routing?
AS’s want to set policies about non-local routing
Should DukeNet use Link 1 or 2 to ATT?
AS’s need not reveal internal topology of their network
E.g., that DukeNet has 4 routers
Slide45
Inter and Intra-domain routing
Routing organized in two levels
Intra-domain routing
Complete knowledge, strive for optimal paths
Scale to ~100 networks
Today
Inter-domain routing
Aggregated knowledge, scale to Internet
Dominated by policy
E.g., route through X, unless X is unavailable, then route through Y. Never route traffic from X to Y.
Policies reflect business agreements, can get complex
Next lecture
Slide46
[Figure: Le Theo Net, DukeNet (routers A, B, C, D), and ATT; Lnk 1 and Lnk 2]
Intradomain: routing inside DukeNet
Interdomain: routing across DukeNet, ATT, TheoNet
Slide47
Today’s Lecture
Forwarding
IP-Address/IP-Packet Format
Fragmentation
Debugging the network: ICMP
Getting IP-Address: ARP + DHCP
Routing
Intra-Domain Routing
Distance Vector Protocol
Loop Detection + Avoidance
Slide48
Network as a graph
Nodes are routers
Assign cost to each edge
Can be based on latency, b/w, queue length, …
Problem: find lowest-cost path between nodes
Each node individually computes routes
Slide49
Basic Algorithms
Two classes of intra-domain routing algorithms
Distance Vector (Bellman-Ford SP Algorithm)
Requires only local state
Harder to debug
Can suffer from loops
Link State (Dijkstra-Prim SP Algorithm)
Each node has global view of the network
Simpler to debug
Requires global state
Slide50
Distance Vector
Local routing algorithm
Each node maintains a set of triples
<Destination, Cost, NextHop>
Exchange updates with neighbors
Periodically (seconds to minutes)
Whenever table changes (triggered update)
Each update is a list of pairs
<Destination, Cost>
Update local table if receive a “better” route
Smaller cost
Refresh existing routes, delete if time out
Slide51
DV Example
B only exchanges information with A and C
Slide52
DV Example
B only exchanges information with A and C
B’s routing table @ time = 0
Destination   Cost       Next Hop
A             1          A
C             1          C
D             infinity   --
E             infinity   --
F             infinity   --
G             infinity   --
C’s update: <D, 1>, <A, 1>
Slide54
DV Example
B only exchanges information with A and C
B’s routing table (after C’s update)
Destination   Cost       Next Hop
A             1          A
C             1          C
D             2          C
E             infinity   --
F             infinity   --
G             infinity   --
C’s update: <D, 1>, <A, 1>
Slide55
Calculating the best path
Bellman-Ford equation
Let:
$D_b(d)$ denote the current best distance from b to d
$c(b,c)$ denote the cost of a link from b to c
Then $D_b(d) = \min\bigl(D_b(d),\; c(b,c) + D_c(d)\bigr)$
Routing messages contain D
D is any additive metric
e.g., number of hops, queue length, delay
log can convert a multiplicative metric into an additive one (e.g., probability of failure)
C’s update: <D, 1>, <A, 1>
B’s routing table before the update:
Destination   Cost       Next Hop
A             1          A
C             1          C
D             infinity   --
E             infinity   --
F             infinity   --
G             infinity   --
$D_b(D) = \min(\infty,\; 1 + 1) = 2$
$D_b(A) = \min(1,\; 1 + 1) = 1$
Slide57
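The Bellman-Ford update applied to B's table on the previous slide can be written as a few lines of code. This is a minimal sketch under assumptions: the table layout (`dest -> (cost, next_hop)`) and the name `dv_update` are chosen for illustration, and route timeouts and triggered updates are omitted.

```python
INF = float("inf")

def dv_update(table, neighbor, link_cost, advert):
    """Apply one neighbor's <Destination, Cost> list to a local routing table.

    table maps destination -> (cost, next_hop); returns True if anything changed.
    """
    changed = False
    for dest, advertised_cost in advert.items():
        candidate = link_cost + advertised_cost          # c(b,c) + D_c(d)
        current_cost, _ = table.get(dest, (INF, None))   # D_b(d)
        if candidate < current_cost:                     # keep the smaller cost
            table[dest] = (candidate, neighbor)
            changed = True
    return changed

# B's table at time 0, then C's update <D, 1>, <A, 1> over a link of cost 1.
b_table = {"A": (1, "A"), "C": (1, "C"), "D": (INF, None),
           "E": (INF, None), "F": (INF, None), "G": (INF, None)}
dv_update(b_table, "C", 1, {"D": 1, "A": 1})
print(b_table["D"], b_table["A"])
```

B learns a route to D at cost 2 through C, while its existing cost-1 route to A is kept because 1 + 1 is not smaller.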
DV Example
B’s routing table
Destination   Cost   Next Hop
A             1      A
C             1      C
D             2      C
E             2      A
F             2      A
G             3      A
Slide59
Adapting to Failures
F-G fails
F sets distance to G to infinity, propagates
A sets distance to G to infinity
A receives periodic update from C with 2-hop path to G
A sets distance to G to 3 and propagates
F sets distance to G to 4, through A
[Figure: entries for destination G as <Destination, Cost, Next Hop>. F: <G, 1, G>, then <G, ∞, ->, then <G, 4, A>. A: <G, 2, F>, then <G, ∞, ->, then <G, 3, C>. C: <G, 2, D>. Also shown: <G, 3, D> and <G, 3, A>]
Slide60
Count-to-Infinity
Link from A to E fails
A advertises distance of infinity to E
B and C advertise a distance of 2 to E
B decides it can reach E in 3 hops through C
A decides it can reach E in 4 hops through B
C decides it can reach E in 5 hops through A, …
When does this stop?
Slide61
Good news travels fast
[Figure: nodes A, B, C with link costs A-B = 4, B-C = 1, A-C = 10; one link cost decreases to 1]
A decrease in link cost has to be fresh information
Network converges at most in O(diameter) steps
Slide62
Bad news travels slowly
[Figure: nodes A, B, C with link costs A-B = 4, B-C = 1, A-C = 10; the A-B cost then increases to 12]
An increase in cost may cause confusion with old information, may form loops
Consider routes to A
Initially, B:A,4,A; C:A,5,B
Then B:A,12,A, selects C as next hop -> B:A,6,C
C -> A,7,B; B -> A,8,C; C -> A,9,B; B -> A,10,C;
C finally chooses C:A,10,A, and B -> A,11,C!
Routing tables as <Destination, Cost, Next Hop>:
Initially: B: <A, 4, A>, <C, 1, C>; C: <A, 5, B>, <B, 1, B>; A: <B, 4, B>, <C, 5, B>
After the A-B cost rises to 12: B: <A, 6, C>, <C, 1, C>; C: <A, 5, B>, <B, 1, B>; A: <B, 11, C>, <C, 10, C>
Finally: B: <A, 11, C>, <C, 1, C>; C: <A, 10, A>, <B, 1, B>; A: <B, 11, C>, <C, 10, C>
Slide66
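The slow climb in the example above (6, 7, 8, 9, 10, then 10 and 11) can be reproduced with a small simulation. This is a sketch under simplifying assumptions: only B and C recompute their route to A, they alternate strictly, and neither uses split horizon. The function name `bad_news` and its parameters are made up for this illustration.

```python
def bad_news(cost_ba=12, cost_bc=1, cost_ca=10, c_initial=5):
    """Trace B's and C's choices for destination A after the A-B cost rises.

    Returns a list of (node, cost, next_hop) in the order the choices change.
    """
    b_route = None                    # B has just lost its cheap route
    c_route = (c_initial, "B")        # C still believes A is 5 away through B
    trace = []
    while True:
        # B compares its direct link with the path through C (trusting C's cost).
        via_c = cost_bc + c_route[0]
        new_b = (via_c, "C") if via_c < cost_ba else (cost_ba, "A")
        # C compares its direct link with the path through B (trusting B's cost).
        via_b = cost_bc + new_b[0]
        new_c = (via_b, "B") if via_b < cost_ca else (cost_ca, "A")
        if new_b == b_route and new_c == c_route:
            return trace              # nothing changed: converged
        if new_b != b_route:
            trace.append(("B",) + new_b)
        if new_c != c_route:
            trace.append(("C",) + new_c)
        b_route, c_route = new_b, new_c

for node, cost, next_hop in bad_news():
    print(f"{node}: A,{cost},{next_hop}")
```

Each round raises the believed cost by only the B-C link cost, which is why convergence takes several exchanges.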
How to avoid loops
IP TTL field prevents a packet from living forever
Does not repair a loop
Simple approach: consider a small cost n (e.g., 16) to be infinity
After n rounds decide node is unavailable
But rounds can be long, this takes time
Problem: distance vector based only on local information
Slide67
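Bad news travels slowly
[Figure: nodes A, B, C with link costs A-B = 12 (was 4), B-C = 1, A-C = 10]
Final routing tables as <Destination, Cost, Next Hop>:
B: <A, 11, C>, <C, 1, C>
C: <A, 10, A>, <B, 1, B>
A: <B, 11, C>, <C, 10, C>
Why did it take a while to converge?
Slide68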
Better loop avoidance
Split Horizon
When sending updates to node A, don’t include routes you learned from A
Prevents B and C from sending cost 2 to A
Split Horizon with Poison Reverse
Rather than not advertising routes learned from A, explicitly include cost of ∞.
Faster to break out of loops, but increases advertisement sizes
Slide69
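The difference between the two rules above comes down to how a node builds the update it sends to each neighbor. The sketch below assumes a table of the form `dest -> (cost, next_hop)`; the function `build_advert` and the sample table are hypothetical.

```python
INF = float("inf")

def build_advert(table, neighbor, poison_reverse=False):
    """Build the <Destination, Cost> list to send to one neighbor.

    Routes learned from that neighbor are left out (split horizon) or
    advertised with infinite cost (split horizon with poison reverse).
    """
    advert = {}
    for dest, (cost, next_hop) in table.items():
        if next_hop == neighbor:
            if poison_reverse:
                advert[dest] = INF    # explicitly tell the neighbor not to use us
            continue                  # plain split horizon: say nothing
        advert[dest] = cost
    return advert

# Hypothetical table at B: E is reached through A, C is a direct neighbor.
b_table = {"E": (2, "A"), "C": (1, "C")}
print(build_advert(b_table, "A"))
print(build_advert(b_table, "A", poison_reverse=True))
```

The poisoned variant sends one extra entry per suppressed route, which is the larger advertisement size the slide mentions.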
Warning
Split horizon/split horizon with poison reverse only help between two nodes
Can still get loop with three nodes involved
Might need to delay advertising routes after changes, but affects convergence time
Slide70
Today’s Lecture
Forwarding
IP-Address/IP-Packet Format
Fragmentation
Network Error Messages (Debugging): ICMP
Getting IP-Address: ARP + DHCP
Routing
Intra-Domain Routing: RIP
Next class:
Intra-Domain Routing: OSPF, OSPF vs. RIP