Efficient and Robust Traffic Engineering in a Dynamic Environment
Ph.D. Dissertation Defense: Hao Wang
Advisor: Y. Richard Yang
Committee: Joan Feigenbaum, Jennifer Rexford (Princeton), Avi Silberschatz
Efficient and Robust Internet Traffic Engineering
Y. Richard Yang
Nov. 2010
Based on slides from Hao Wang's defense and the R3 SIGCOMM presentation
Collaborators (2003–2010)
Richard Alimi
Alex Gerber (AT&T Research)
Albert Greenberg (Microsoft Research)
Paul H. Liu
Zheng Ma
Lili Qiu (UT Austin)
Jia Wang (AT&T Research)
Ye Wang
Haiyong Xie
Yin Zhang (UT Austin)
Internet as a Social Infrastructure
Growing fast: IPTV traffic increased 10x over 2007–2011
General IP traffic trend
Applications with more stringent requirements, e.g., IPTV/VoD, VoIP, telecommuting/video-conferencing

            2009    2010    2011    2012    2013    2014    CAGR
Consumer    11,602  16,534  23,750  32,545  43,117  55,801  37%
Business     3,083   3,862   4,740   5,697   6,801   8,103  21%

Source: Cisco VNI 2010. Unit: PB/month
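As a quick sanity check on the table's growth rates, the consumer CAGR follows from the 2009 and 2014 endpoints (a one-line sketch):

```python
# 5-year compound annual growth rate from the Cisco VNI 2010 consumer
# figures (PB/month) in the table above.
print(f"{(55_801 / 11_602) ** (1 / 5) - 1:.0%}")   # -> 37%
```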
Internet Routing
Determines paths from source to destination
For applications: delay, loss, delay jitter, …
For ISPs:
Efficiency of resource usage
Capability of failure recovery
Scalability
Routing is becoming more important
High cost of network assets
Highly competitive nature of the ISP market
Traffic Engineering (TE)
Objective: efficient & reliable routing
Input: topology, traffic, ISP objective, application requirements
Output: routing
[Figure: example topology with nodes A–F; traffic split 0.8 / 0.2 across two paths]
Challenges in a Dynamic Environment
Traffic fluctuates
Topology changes
Multiple ISP objectives
[Diagram: TE takes traffic, topology, ISP objective, and application requirements as input and produces a routing]
Challenge due to Traffic Dynamics
Traffic fluctuates, e.g.,
Diurnal patterns
Worms/viruses, DoS attacks, flash crowds
BGP routing changes, load balancing by multihomed customers, TE by peers, failures in other networks
Implications: can lead to long delay, high loss, reduced throughput
Challenge due to Topology Dynamics
Topology changes, e.g.,
Maintenance, failures, misconfigurations
Accidents / disasters
e.g., 675,000 excavation accidents per year [Common Ground Alliance]; network cable cuts every few days
Implications: substantial disruption to the Internet
E.g., two link failures in Sprint led to the disconnection of millions of wireless users and the partition of many office networks [Sprint]
Challenge due to ISP Objectives
Network traffic & topology are relatively stable most of the time
But unexpected scenarios do happen, and many happen when service is most valuable!
disasters, flash crowds, …
unhandled dynamics => violation of SLAs
customers remember bad experiences very well
How to balance between:
Common-case performance
Unexpected-case performance
Summary of Project Contributions
[Diagram: traffic variations, topology changes, and ISP objectives all feed into traffic engineering over the network infrastructure]
Summary of Contributions
[Diagram adds: Resilient Traffic Engineering Service, Interdomain Reliability Service]
Summary of Contributions
[Diagram adds: Resilient Routing Reconfiguration, addressing traffic variations and topology changes]
What's Next
[Same contributions diagram, highlighting the next topic]
Challenge: Unpredictable Traffic
Internet traffic is highly unpredictable!
It can be relatively stable most of the time…
However, it usually contains spikes that ramp up extremely quickly
We saw traffic spikes in the traces of several networks
Unpredictable traffic variations have been observed and studied by other researchers [Teixeira et al. '04, Uhlig & Bonaventure '02, Xu et al. '05]
Confirmed by operators of several large networks via an email survey
Previous TE Approaches
Prediction-based TE. Examples:
Off-line: single predicted TM [Sharad et al. '05]; multiple predicted TMs [Zhang et al. '05]
On-line: MATE [Elwalid et al. '01] & TeXCP [Kandula et al. '05]
Pro: works great when traffic is predictable
Con: may pay a high penalty when real traffic deviates substantially from the prediction
Previous TE Approaches (cont'd)
Traffic-oblivious routing. Examples:
Oblivious routing [Racke '02, Azar et al. '03, Applegate et al. '03]
Valiant load-balancing [Kodialam et al. '05, Zhang & McKeown '04]
Pro: provides worst-case performance bounds
Con: may be sub-optimal for normal traffic
The optimal oblivious ratio of several real network topologies studied in [Applegate et al. '03] is ~2; an over-provisioning rate of ~2 is still too high
Our Approach: COPE
Objectives
Good common-case performance
Tight worst-case guarantee
Our approach
Common-case optimization: optimize for predicted, normal traffic load
Penalty envelope: bound worst-case performance for the unexpected
COPE Illustrated
COPE = Common-case Optimization with Penalty Envelope
[Figure: optimize over the common-case set C; bound performance over the larger set X]

min_f max_{d∈C} P_C(f, d)
s.t. (1) f is a routing
     (2) ∀x ∈ X: P_X(f, x) ≤ PE

C: common-case (predicted) TMs
X: all TMs of interest
P_C(f, d): common-case penalty function
P_X(f, x): worst-case penalty function
PE: penalty envelope
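To make the min-max structure concrete, the sketch below solves a COPE-style LP on a toy network of two parallel links, assuming finite scenario sets C and X (the actual formulation handles the infinite set X via the slave-LP duality on the following slides; all numbers are illustrative):

```python
# COPE's structure on two parallel a->b links: minimize the common-case
# MLU (max over C) subject to a penalty envelope PE on every TM in X.
from scipy.optimize import linprog

c1, c2 = 10.0, 10.0        # link capacities
C = [6.0, 8.0]             # common-case (predicted) demands
X = [6.0, 8.0, 18.0]       # demands of interest, incl. an unexpected spike
PE = 1.2                   # penalty envelope: worst-case MLU bound

A_ub, b_ub = [], []        # variables: [f, u]; f = share on link 1
for d in C:                # common case: utilization of both links <= u
    A_ub.append([ d / c1, -1.0]); b_ub.append(0.0)
    A_ub.append([-d / c2, -1.0]); b_ub.append(-d / c2)
for x in X:                # envelope: utilization <= PE for every x in X
    A_ub.append([ x / c1, 0.0]); b_ub.append(PE)
    A_ub.append([-x / c2, 0.0]); b_ub.append(PE - x / c2)

res = linprog([0.0, 1.0], A_ub=A_ub, b_ub=b_ub,
              bounds=[(0.0, 1.0), (0.0, None)])   # minimize u
f_opt, u_opt = res.x
print(f"split {f_opt:.2f} on link 1, common-case MLU {u_opt:.2f}")
```

With these numbers the optimal split is 0.5/0.5: the common-case MLU is 0.40, while the 18-unit spike stays within the 1.2 envelope.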
COPE in Perspective
Prediction-based TE: best common-case + poor worst-case
Oblivious routing: poor common-case + best worst-case
COPE: good common-case + bounded worst-case
A spectrum of TE with unpredictable traffic; the position is controllable via the penalty envelope
The worst unexpected case is too unlikely to occur: it is too wasteful to "optimize" for the worst case (at the cost of poor common-case performance)
Yet there are enough unexpected cases that a penalty envelope is required
Model: More Details
Network topology: graph G = (V, E)
V: set of routers
E: set of network links; link (i,j) has capacity c_ij
Traffic matrices (TMs)
A TM is a set of demands: d = { d_ab | a,b ∈ V }
d_ab: traffic demand from a to b
Link-based routing: f = { f_ab(i,j) | a,b ∈ V, (i,j) ∈ E }
f_ab(i,j): the fraction of the demand from a to b (i.e., d_ab) that is routed through link (i,j)
Routing Performance Metric
Maximum Link Utilization (MLU):
U(f, d) = max_{(i,j)∈E} [ Σ_{a,b∈V} d_ab · f_ab(i,j) ] / c_ij
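A direct computation of U(f, d), using toy dict-based structures for the routing, demand, and capacities (an illustrative sketch, not the dissertation's code):

```python
# d[(a,b)] = demand; f[(a,b)][(i,j)] = fraction of d[(a,b)] on link (i,j).
def mlu(f, d, cap):
    """Maximum link utilization U(f, d) = max_e (load on e) / cap[e]."""
    load = {e: 0.0 for e in cap}
    for (a, b), dem in d.items():
        for e, frac in f[(a, b)].items():
            load[e] += dem * frac
    return max(load[e] / cap[e] for e in cap)

cap = {("a", "b"): 10.0, ("a", "c"): 5.0, ("c", "b"): 10.0}
d = {("a", "b"): 6.0}
f = {("a", "b"): {("a", "b"): 0.8, ("a", "c"): 0.2, ("c", "b"): 0.2}}
print(mlu(f, d, cap))   # 0.48: link (a,b) carries 4.8 of capacity 10
```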
Penalty Envelope – A Case Study
X: the set of TMs with access capacity constraints
P_X(f, x): MLU of routing f on demand x
PE: upper bound on MLU
Direct formulation of ∀x ∈ X: P_X(f, x) ≤ PE: infinite # of cases
Slave LP
Test whether the envelope constraint is satisfied by solving the following LP for each link (i,j): maximize that link's utilization, r = Σ_{a,b} x_ab · f_ab(i,j) / c_ij, over all x ∈ X (i.e., subject to the access capacity constraints)
Constraint satisfied if slave LP objective r* ≤ PE
Slave LP can serve as a separation oracle in the Ellipsoid Method, but that is still not fast enough
Can we use the Interior-Point Method?
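To illustrate the separation-oracle idea, here is a toy slave LP for a single link; the per-OD demand caps are an assumed stand-in for the access capacity constraints defining X:

```python
# Slave LP sketch for one link: maximize its utilization over admissible x.
from scipy.optimize import linprog

frac = [0.6, 0.3]        # f_ab(i,j) for two OD pairs on this link
cap_link = 10.0          # capacity c_ij of the link under test
access = [8.0, 12.0]     # admissible demand cap per OD pair (stand-in for X)

# maximize sum_ab x_ab * f_ab(i,j) / c_ij  ==  minimize the negation
res = linprog(c=[-fr / cap_link for fr in frac],
              bounds=[(0.0, a) for a in access])
r_star = -res.fun
print(r_star)            # 0.84; the envelope holds iff r* <= PE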
Dual of Slave LP
Introduce dual variables for the access capacity constraints
Form the dual function of the slave LP
Dual of Slave LP (cont'd)
Since there is no duality gap, r* ≤ PE iff the dual optimum is at most PE
Polynomial # of constraints, so we can use the Interior-Point Method
Evaluations
Evaluated algorithms:
COPE: COPE with P_C(f,d) = PR(f,d) (i.e., performance ratio)
COPE-MLU: COPE with P_C(f,d) = U(f,d) (i.e., max link utilization)
Oblivious routing: min_f max_x PR(f,x) (COPE with β = 1)
Dynamic: optimize routing for the TM in the previous interval
Peak: peak interval of previous day + previous/same days in last week
Multi: all intervals in previous day + previous/same days in last week
Optimal: requires an oracle
Datasets:
US-ISP: hourly PoP-level TMs for a tier-1 ISP (1 month in 2005); optimal oblivious ratio: 2.045; default penalty envelope: 2.5
Abilene: 5-min router-level TMs on Abilene (6 months: Mar – Sep. 2004); optimal oblivious ratio: 1.853; default penalty envelope: 2.0
US-ISP: Performance Ratio
Common cases: COPE is close to dynamic and much better than others
Unexpected cases: COPE beats even oblivious and is much better than others
Performance Ratio = (MLU of the algorithm) / (MLU of optimal routing)
Abilene: Performance Ratio
Common cases: COPE is close to dynamic and much better than others
Unexpected cases: COPE is close to oblivious and much better than others
Abilene: MLU in Unexpected Cases
Unexpected cases: COPE is close to oblivious and much better than others
US-ISP: Sensitivity to PE
Even a small margin in PE can significantly improve common-case performance
COPE Summary
COPE = Common-case Optimization with Penalty Envelope
COPE works!
Common cases: close to optimal; much better than oblivious routing and prediction-based TE, with comparable overhead
Unexpected cases: much better than prediction-based TE, and sometimes even beats oblivious routing
Even a small margin in PE improves common-case performance a lot
COPE as an optimization paradigm also applies to many other contexts, e.g., interdomain
Remaining Issues: Topology Dynamics
COPE handles traffic variations, not topology variations, yet failures are common in operational IP networks: accidents, attacks, hardware failures, misconfigurations, maintenance
Multiple unexpected failures may overlap, e.g., concurrent fiber cuts in Sprint (2006)
Planned maintenance affects multiple network elements and may overlap with unexpected failures (e.g., due to inaccurate SRLGs)
Increasingly stringent requirements on reliability: VoIP, video conferencing, gaming, mission-critical apps, etc.
SLAs have teeth: violations directly affect ISP revenue
Need resiliency: the network should recover quickly & smoothly from one or multiple overlapping failures
Challenge: Topology Uncertainty
The number of failure scenarios quickly explodes: a 500-link network has > 20,000,000 possible 3-link failures!
Difficult to optimize routing to avoid congestion under all possible failure scenarios: brute-force failure enumeration is clearly infeasible, and existing methods handle only 100s of topologies
Difficult to install fast rerouting: preconfigure 20,000,000 backup routes on routers?
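The scenario count is easy to verify (counting unordered triples of the 500 links):

```python
from math import comb
print(comb(500, 3))   # 20,708,500 three-link failure scenarios
```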
Existing Approaches & Limitations
Focus exclusively on reachability, e.g., FRR, FCP (Failure Carrying Packets), Path Splicing
May suffer from congestion and unpredictable performance
Congestion is mostly caused by rerouting under failures [Iyer et al.]
Multiple network element failures have a domino effect on FRR rerouting, resulting in network instability [N. So & H. Huang]
Only consider a small subset of failures, e.g., single-link failures [D. Applegate et al.]; insufficient for demanding SLAs
Online routing re-optimization after failures is too slow and cannot support fast rerouting
R3: Resilient Routing Reconfiguration
A novel link-based routing protection scheme:
requires no enumeration of failure scenarios
provably congestion-free for all up-to-F link failures
efficient w.r.t. router processing/storage overhead
flexible in supporting diverse practical requirements
Problem Formulation
Goal: congestion-free rerouting under up-to-F link failures
Input: topology G(V, E), link capacities c_e, traffic demand d
d_ab: traffic demand from router a to router b
Output: base routing r, protection routing p
r_ab(e): fraction of d_ab carried by link e
p_l(e): (link-based) fast rerouting for link l
[Figure: four routers a, b, c, d; link capacities 5 and 10; demand d_ab = 6; base routing r_ab(a→b) = 1; protection routing p_ab(a→c) = 1, p_ab(c→b) = 1]
From Topology Uncertainty to Traffic Uncertainty
Instead of optimizing the original traffic demand on all possible topologies under failures, R3 optimizes a protection routing for a set of traffic demands on the original topology
A rerouting virtual demand set captures the effect of failures on the amount of rerouted traffic
The protection routing on the original topology can be easily reconfigured for use after a failure occurs
Rerouting Virtual Demand Set
Failure scenario (f) → rerouted traffic (x)
Rerouted traffic x_l after link l fails = base load on l given r (r congestion-free ⇒ x_l ≤ c_l)
Rerouted traffic under all possible up-to-F-link failure scenarios (independent of r), taking convex combinations:
X_F = { x | 0 ≤ x_l ≤ c_l, Σ_l (x_l / c_l) ≤ F }
[Figure: four routers a, b, c, d; load/capacity 4/5 on (a,c) and 2/10 on (a,b)]

Failure scenario   Rerouted traffic   Upper bound on rerouted traffic
(a,c) fails        x_ac = 4           x_ac ≤ 5  (= c_ac)
(a,b) fails        x_ab = 2           x_ab ≤ 10 (= c_ab)
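Membership in X_F is a two-condition check; a minimal sketch with toy capacities (assumed data layout):

```python
# Test whether a rerouting virtual demand x lies in
# X_F = { x | 0 <= x_l <= c_l, sum_l x_l / c_l <= F }.
def in_XF(x, cap, F):
    return (all(0.0 <= x[l] <= cap[l] for l in cap)
            and sum(x[l] / cap[l] for l in cap) <= F)

cap = {"ac": 5.0, "ab": 10.0, "cb": 10.0, "ad": 10.0, "db": 10.0}
x = {"ac": 4.0, "ab": 0.0, "cb": 0.0, "ad": 0.0, "db": 0.0}  # (a,c) fails
print(in_XF(x, cap, F=1))   # True: single-failure rerouted load is covered
```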
R3 Overview
Offline precomputation: plan (r, p) together for the original demand d plus the rerouting virtual demand x on the original topology G(V, E) to minimize congestion; p may "use" links that will later fail
Online reconfiguration: convert and use p for fast rerouting after failures
Offline Precomputation
Compute (r, p) to minimize the MLU (max link utilization) for the original demand d plus rerouting demand x ∈ X_F; r carries d, p carries x ∈ X_F:

min_(r,p) MLU
s.t. [1] r is a routing, p is a routing
     [2] ∀x ∈ X_F, ∀e: [ Σ_{a,b∈V} d_ab·r_ab(e) + Σ_{l∈E} x_l·p_l(e) ] / c_e ≤ MLU
         (first sum: original traffic; second sum: rerouting traffic)

Challenge: [2] has an infinite number of constraints
Solution: apply LP duality → a polynomial # of constraints
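The duality step can be spelled out as follows; this is the standard dualization of the inner maximization over X_F, consistent with the backup slide "Offline Precomputation Solution" (notation as defined above):

```latex
% For fixed (r,p) and link e, the worst-case rerouting load solves
%   max_x  \sum_{l \in E} x_l\, p_l(e)
%   s.t.   0 \le x_l \le c_l, \qquad \sum_{l \in E} x_l / c_l \le F.
% With dual multipliers \pi_e(l) \ge 0 (for x_l \le c_l) and
% \lambda_e \ge 0 (for the sum constraint), constraint [2] becomes finite:
\[
\sum_{a,b \in V} d_{ab}\, r_{ab}(e) + \sum_{l \in E} c_l\, \pi_e(l) + F\,\lambda_e
\le \mathrm{MLU}\cdot c_e,
\qquad
\pi_e(l) + \frac{\lambda_e}{c_l} \ge p_l(e) \quad \forall l \in E.
\]
```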
Online Reconfiguration
Step 1: fast rerouting after (a,c) fails
Precomputed p for (a,c): p_ac(ac) = 1/3, p_ac(ab) = 1/3, p_ac(ad) = 1/3
(a,c) fails → fast reroute using p_ac minus p_ac(ac), i.e., a rescaled p_ac:
ξ_ac(ac) = 0, ξ_ac(ab) = 1/2, ξ_ac(ad) = 1/2
In general, when link l fails: ξ_l(e) = p_l(e) / (1 − p_l(l))
[Figure: p_ac(e) splits 1/3, 1/3, 1/3 (2/3 and 1/3 downstream) become ξ_ac(e) splits 0, 1/2, 1/2 (1/2 and 1 downstream)]
Online Reconfiguration (cont.)
Step 2: reconfigure p after the failure of (a,c)
Current p for (a,b): p_ab(ac) = 1/2, p_ab(ad) = 1/2
(a,c) fails → the 1/2 on (a,c) needs to be "detoured" using ξ_ac:
p_ab(ac) = 0, p_ab(ad) = 3/4, p_ab(ab) = 1/4
In general, when link l fails: ∀l′, p_l′(e) = p_l′(e) + p_l′(l) · ξ_l(e)
Apply the detour ξ_ac to every protection routing (for other links) that was using (a,c)
[Figure: p_ab(e) splits 1/2, 1/2 become 0, 3/4, 1/4 after the update]
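Both steps fit in one small routine; the sketch below uses dict-based protection routings (an assumed toy layout) and reproduces the numbers on these two slides:

```python
# p[l][e] = fraction of traffic rerouted for failed link l that uses link e.
def reconfigure(p, failed):
    """Update all protection routings after link `failed` fails."""
    pf = p[failed]
    scale = 1.0 - pf.get(failed, 0.0)
    # Step 1: rescaled detour xi_l(e) = p_l(e) / (1 - p_l(l)), xi_l(l) = 0
    xi = {e: (0.0 if e == failed else frac / scale) for e, frac in pf.items()}
    # Step 2: every other protection routing detours its share on `failed`
    for l, pl in p.items():
        if l == failed:
            continue
        moved = pl.pop(failed, 0.0)
        for e, frac in xi.items():
            if e != failed and frac > 0.0:
                pl[e] = pl.get(e, 0.0) + moved * frac
    return xi

p = {"ac": {"ac": 1/3, "ab": 1/3, "ad": 1/3},
     "ab": {"ac": 1/2, "ad": 1/2}}
xi = reconfigure(p, "ac")
print(xi)        # {'ac': 0.0, 'ab': 0.5, 'ad': 0.5}
print(p["ab"])   # {'ad': 0.75, 'ab': 0.25}
```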
R3 Guarantees
Sufficient condition: if ∃(r,p) s.t. MLU ≤ 1 under d + X_F, then there is no congestion under any failure scenario involving up to F links
Necessary condition for single-link failures: if some protection routing guarantees no congestion under any single-link failure scenario, then ∃(r,p) s.t. MLU ≤ 1 under d + X_1
So adding a superset of the rerouted traffic to the original demand is not so wasteful
Open problem: is R3 optimal for > 1 link failures?
R3 online reconfiguration is order-independent under multiple failures
Intuition behind R3
Plan rerouting on the single original topology
Avoids enumeration of topologies (failure scenarios): compute (r,p) to guarantee congestion-freedom for d + x ∈ X_F on G
Puzzles:
Add rerouted traffic before it appears: X_F = { x | 0 ≤ x_l ≤ c_l, Σ_l (x_l / c_l) ≤ F } is a superset of the rerouted traffic under failures — not real traffic demand!
Use network resources that will disappear: the protection routing p uses links that will fail!
By doing these two counter-intuitive things, R3 achieves:
Congestion-freedom under multiple link failures w/o enumeration of failure scenarios
Optimality for single link failures
Fast rerouting
R3 Extensions
Fixed base routing: r can be given (e.g., as an outcome of OSPF)
Trade-off between no-failure and failure protection: add a penalty envelope β (≥ 1) to bound no-failure performance
Trade-off between MLU and end-to-end delay: add an envelope γ (≥ 1) to bound end-to-end path delay
Prioritized traffic protection: associate different protection levels with traffic of different priorities
Realistic failure scenarios: Shared Risk Link Groups, Maintenance Link Groups
Traffic variations: optimize (r,p) for d ∈ D + x ∈ X_F
Evaluation Methodology
Network topology
Real: Abilene, US-ISP (PoP-level)
Rocketfuel: Level-3, SBC, and UUNet
Synthetic: GT-ITM
Traffic demand
Real: Abilene, US-ISP
Synthetic: gravity model
Failure scenarios
Abilene: all failure scenarios with up to 3 physical links
US-ISP: maintenance events (6 months) + all 1-link and 2-link failures + ~1100 sampled 3-link failures
Enumeration is only needed for evaluation, not for R3
Evaluation Methodology (cont.)
R3 vs. other rerouting schemes
OSPF+R3: add R3 rerouting to OSPF
MPLS-ff+R3: "ideal" R3 (flow-based base routing)
OSPF+opt: benchmark, optimal rerouting (based on OSPF)
OSPF+CSPF-detour: commonly used
OSPF+recon: ideal OSPF reconvergence
FCP: Failure Carrying Packets [Lakshminarayanan et al.]
PathSplice: Path Splicing (k=10, a=0, b=3) [Motiwala et al.]
Performance metrics
MLU (maximum link utilization) measures congestion; lower is better
Performance ratio = MLU of the algorithm / MLU of optimal routing for the changed topology (corresponding to the failure scenario); always ≥ 1, closer to 1 is better
US-ISP Single Failure
R3 achieves near-optimal performance and outperforms other schemes significantly
US-ISP Multiple Failures
R3 consistently outperforms other schemes by at least 50%
[Plots: all two-link failure scenarios; sampled three-link failure scenarios]
US-ISP No Failure: Penalty Envelope
R3 is near optimal under no failures with a 10% penalty envelope
Experiments using Implementation
R3 Linux software router implementation
Based on Linux kernel 2.6.25 + Linux MPLS
Implements flow-based fast rerouting and efficient R3 online reconfiguration by extending Linux MPLS
Abilene topology emulated on Emulab
3 physical link failures (6 directed link failures)
R3 vs. OSPF+recon: Link Utilization
R3 outperforms OSPF+recon by a factor of ~3
Precomputation Complexity
Profiled computation time (in seconds) on a 2.33 GHz CPU with 4 GB memory
Offline precomputation time < 36 minutes (operational topologies: < 17 minutes)
Computation time is stable as the number of protected failures grows
Linux Implementation
MPLS-ff software router
Based on Linux kernel 2.6.25 + Linux MPLS
Implements flow-based routing for efficient R3 online reconfiguration
Extends the MPLS FWD (forward) data structure to enable per-hop traffic splitting for each flow
Failure detection and response
Detection using Ethernet monitoring in the kernel; in operational networks, detection can be conservative (for SRLG, MLG)
Notification using ICMP 42 flooding; requires reachability under failures
Rerouting traffic by MPLS-ff label stacking
Online reconfiguration of traffic splitting ratios (locally at each router)
MPLS-ff
R3 uses flow-based traffic splitting (for each OD pair)
e.g., p_newy→wash(chic→hous) = 0.2 means link chic→hous will carry 20% of the traffic originally carried by newy→wash when newy→wash fails
Current MPLS routers support only path-based traffic splitting
Traffic load on equal-cost LSPs is proportional to the bandwidth requested by each LSP
Juniper J-, M-, T-series and Cisco 7200, 7500, 12000 series
e.g., multiple paths from newy to wash can be set up to protect link newy→wash
MPLS-ff (cont.)
Convert flow-based to path-based routing, e.g., using a flow decomposition algorithm [Zheng et al.]
Expensive for R3 online reconfiguration: LSPs must be recomputed and signaled after each failure
Instead, extend MPLS to support flow-based routing: MPLS-ff
Enables next-hop traffic splitting ratios for each flow
Flow: traffic originally carried by a protected link
MPLS-ff (cont.)
The MPLS FWD data structure is extended to support multiple Next Hop Label Forwarding Entries (NHLFEs)
One NHLFE specifies one neighbor
One next-hop splitting ratio per NHLFE
[Figure: router R1 with neighbors R2, R3, R4; "Label 200 FWD: NHLFE R2: 50%, NHLFE R3: 50%, NHLFE R4: 0%"]
MPLS-ff (cont.)
Implementing the next-hop splitting ratio (router i, next-hop j, protected link (a,b)) via packet hashing:
Same hash value for packets in the same TCP flow
Independent hash values on different routers for any particular TCP flow
Therefore hash (packet flow fields + router id)
With Hash = f(src, dst, srcPort, dstPort) alone, if router i forwards to j only when 40 < Hash < 64, then all packets arriving from i already satisfy 40 < Hash < 64: a skewed hash value distribution downstream
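A sketch of the salted-hash idea in Python (illustrative only; the actual implementation lives in the Linux kernel): including the router id in the hash decorrelates the splitting decisions successive routers make for the same TCP flow, avoiding the skew described above.

```python
import hashlib

def split_hash(src, dst, sport, dport, router_id, buckets=64):
    # Salt the flow fields with the router id so each router draws an
    # independent bucket for the same TCP flow.
    key = f"{src}|{dst}|{sport}|{dport}|{router_id}".encode()
    return int.from_bytes(hashlib.md5(key).digest()[:4], "big") % buckets

flow = ("10.0.0.1", "10.0.0.2", 12345, 80)
# Same flow, different routers -> independent bucket values, so a 25%
# next-hop split at each router really carries ~25% of the flows.
print(split_hash(*flow, router_id="R1"), split_hash(*flow, router_id="R2"))
```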
Failure Response
Failure detection and notification
Detection: Ethernet monitoring in the Linux kernel
Notification: ICMP 42 flood
Failure response: MPLS-ff label stacking
Protection routing update: R3 online reconfiguration requires a local copy of p on each router
Router Storage Overhead
Estimated maximum router storage overhead for our implementation
ILM (MPLS incoming label mapping) and NHLFE (MPLS next-hop label forwarding entry) entries are required to implement R3 protection routing
Modest router storage overhead: FIB < 300 KB, RIB < 20 MB, #ILM < 460, #NHLFE < 2,500
R3 Linux Implementation: Flow RTT (Denver–Los Angeles)
The R3 implementation achieves smooth routing protection
R3: OD Pair Throughput
Traffic demand is carried by R3 under multiple link failure scenarios
Summary
R3: Resilient Routing Reconfiguration
Provably congestion-free guarantee under multiple failures
Key idea: convert topology uncertainty to traffic uncertainty; offline precomputation + online reconfiguration
Flexible extensions to support practical requirements
Trace-driven simulation: R3 near optimal (> 50% better than existing approaches)
Linux implementation: feasibility and efficiency of R3 in real networks
What's Next
[Contributions diagram: Common-case Optimization with Penalty Envelope, Resilient Routing Reconfiguration, Interdomain Reliability Service]
"Any future Internet should attain the highest possible level of availability, so that it can be used for mission-critical activities, and it can serve the nation in times of crisis." – GENI, 2006
"The 3 elements which carriers are most concerned about when deploying communication services are: network reliability, network usability, network fault processing capabilities." – Telemark, 2006
The top 3 all belong to reliability!
Reliability Needs Redundancy
Over-provisioning of bandwidth
Diversity of physical connectivity
Challenge: significant investments
Extra equipment for over-provisioning
Expense & difficulty in obtaining rights-of-way for diversity
REIN Overview
REIN = REliability as an INterdomain Service
Objective: increase the redundancy available to an IP network at low cost
Basic idea
Observation: IP networks overlap, yet they differ
IP networks can provide redundancy for each other
Effects: sharing improves reliability and reduces costs
Analogy: insurance, airline alliances
Example: Jan. 9, 2006, on a Major US ISP
[Map: fiber route through Stockton, Rialto, El Paso, Oroville, Los Angeles, Dallas; an interdomain bypass path through AT&T]
How to Make REIN Work: the Details
Why would IP networks share interdomain bypass paths?
Peering, cost-free, or customer-provider relationships
What is the signaling protocol to share these paths?
Manual configuration, a new protocol, or BGP communities
How can an interdomain bypass path be used in the intradomain forwarding path?
Interdomain GMPLS or IP tunnels
Evaluation
Datasets
US-ISP: hourly PoP-level TMs for a tier-1 ISP (1 month in 2007)
Abilene: 5-min router-level TMs on Abilene (6 months: Mar – Sep. 2004)
RocketFuel PoP-level topologies
TE algorithms
TE-R (robust): COPE with REIN awareness
Oblivious routing/bypassing (oblivious)
Constrained Shortest Path First rerouting (CSPF)
Flow-based optimal routing (optimal)
Why REIN: Connectivity Improvements
Without REIN, as many as 60% of links have connectivity < 3, especially in some smaller networks
With ≤ 7 REIN routes, this reduces to < 10%
Why REIN: Overload Prevention (Abilene, 2-link failures)
Without REIN, even optimal routing overloads bottleneck links by ~300%
With 10 interdomain bypass paths of 2 Gbps each, REIN reduces MLU to ~80%
[Plot: Abilene bottleneck link traffic intensity, 2-link failures, Tuesday, August 31, 2004]
REIN Summary
An interdomain service to improve the redundancy of IP networks at low cost
Significantly improves network reliability, especially when used with COPE to utilize network resources under failures
Outline
Challenges to TE in a dynamic environment
COPE/R3
REIN
From algorithms to a toolkit
Torte: A Toolkit for Optimal & Robust TE
http://www-net.cs.yale.edu/projects/torte/tools/
Step 1: Compute TE routing f
Inputs: network topology, traffic demand matrices, penalty envelopes, protected link sets
Step 2: Convert link-based routing to path-based routing
Output: paths with traffic split ratios
Step 3: Generate router configuration
Global MPLS configuration, explicit path configuration, LSP configuration, backup LSP/tunnel configuration
Outputs Juniper/Cisco configurations
Link-based vs. Path-based Routing
Our algorithms compute link-based traffic split ratios (for each O-D pair)
e.g., f_newy→losa(chic→hous) = 0.25 means link chic→hous carries 25% of the traffic from newy to losa
Current MPLS-enabled routers support only path-based traffic split ratios
Traffic load on equal-cost LSPs is (typically) proportional to the bandwidth requested by each LSP
Juniper J-, M-, T-series; Cisco 7200, 7500, 12000 series
From Link-based to Path-based
Use the coverage-based method developed at Yale LANS to convert link-based routing to path-based routing [Zheng et al., 2007]
A flow decomposition algorithm selects paths one by one until reaching the required coverage (for the O-D pair)
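A minimal sketch of the path-peeling step, assuming an acyclic link-based routing for one O-D pair (a toy reconstruction; the actual coverage-based algorithm is in [Zheng et al., 2007]):

```python
# f maps directed links (u, v) to flow fractions for a single O-D pair.
def decompose(f, src, dst, q=0.95):
    f = dict(f)                       # residual flow, consumed as we peel
    paths, covered = [], 0.0
    while covered < q:
        # walk one src -> dst path along links with positive residual flow
        path, node = [], src
        while node != dst:
            link = max(((u, v) for (u, v) in f if u == node and f[(u, v)] > 1e-12),
                       key=lambda e: f[e], default=None)
            if link is None:
                return paths          # no more flow to peel
            path.append(link)
            node = link[1]
        w = min(f[e] for e in path)   # bottleneck weight of this path
        for e in path:
            f[e] -= w
        paths.append((path, w))
        covered += w
    return paths

f = {("newy", "chic"): 0.75, ("chic", "losa"): 0.75,
     ("newy", "hous"): 0.25, ("hous", "losa"): 0.25}
for path, w in decompose(f, "newy", "losa"):
    print(w, path)   # 0.75 via chic, then 0.25 via hous
```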
An Example
[Figure: worked example of coverage-based path generation]
Performance Bound of Path Generation
Theorem: Given a link-based routing f and a q-coverage path set for f, there is a path-based routing over the q-coverage path set s.t. for any input, the MLU under the path-based routing is bounded by 1/q of the MLU achieved by f.
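The 1/q bound follows from a short scaling argument (my reconstruction of the reasoning, not the dissertation's proof):

```latex
% Let the q-coverage path set carry fraction q_{ab} \ge q of each demand
% d_{ab} under f, with path weights w_{ab}(P) obtained by decomposition,
% so that \sum_{P \ni e} w_{ab}(P) \le f_{ab}(e). Renormalize each O-D
% pair's weights by 1/q_{ab} to obtain a valid path-based routing. Its
% load on any link e is then
\[
\mathrm{load}'(e) = \sum_{a,b} d_{ab} \frac{1}{q_{ab}} \sum_{P \ni e} w_{ab}(P)
\;\le\; \frac{1}{q} \sum_{a,b} d_{ab}\, f_{ab}(e),
\]
% so the MLU of the path-based routing is at most 1/q times that of f.
```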
Summary of Contributions
[Contributions diagram: Resilient Traffic Engineering, Interdomain Reliability Service, Resilient Routing Reconfiguration]
Future Directions
Problem of pure MPLS TE: may need to configure many LSPs
80 core routers × avg. 5 LSPs per O-D pair → a total of 80 × 80 × 5 = 32,000 LSPs!
Configuration overhead; router memory requirements
Possible solutions
Pick heavy-hitter O-D pairs; others just use OSPF default routing
Hybrid OSPF/MPLS TE
How to optimize OSPF other than using heuristics
Future Directions (cont'd)
Online adaptation with penalty envelope
Network and/or traffic evolve over time; the "normal network load" may change dramatically at a medium time scale
Diurnal pattern; weekly pattern
Possible solution: gradually adapt to changes in network/traffic conditions
Difficulty: maintaining robustness during the course of adaptation
Thank you!
Summary of Contributions
[Contributions diagram: Common-case Optimization with Penalty Envelope [SIGCOMM'06], Interdomain Reliability Service [SIGCOMM'07], Resilient Routing Reconfiguration]
Backup Slides
Outline
Challenges to TE in a dynamic environment
Common-case Optimization with Penalty Envelope
Uses traffic variation as the target application
Interdomain Reliability Service
Implementation
Routing Performance Metrics
Maximum Link Utilization (MLU): U(f, d)
Optimal Utilization: the minimum MLU achievable for demand d
Performance Ratio: PR(f, d) = U(f, d) / optimal utilization for d
COPE Instantiation
C: convex hull of multiple past TMs
A linear predictor predicts the next TM as a convex combination of past TMs (e.g., EWMA)
Aggregating all possible linear predictors gives the convex hull
X: all possible non-negative TMs
Can add access capacity constraints or use a bigger convex hull
P_C(f,d): penalty function for common cases — maximum link utilization U(f,d) or performance ratio PR(f,d)
P_X(f,x): penalty function for worst cases — maximum link utilization U(f,x) or performance ratio PR(f,x)
min_f max_{d∈C} P_C(f, d)
s.t. (1) f is a routing; and (2) ∀x ∈ X: P_X(f, x) ≤ PE
Choosing the Penalty Envelope
PE = β · min_f max_{x∈X} P_X(f, x)
β ≥ 1 controls the size of PE w.r.t. the optimal worst-case penalty
β = 1 ⇒ oblivious routing; β = ∞ ⇒ prediction-based TE
e.g., US-ISP's default envelope of 2.5 over its optimal oblivious ratio of 2.045 corresponds to β ≈ 1.22
COPE Implementation
Collect TMs continuously
Compute COPE routing for the next day by solving a linear program (LP)
Common-case optimization
Common case: convex hull of multiple past TMs (all TMs in the previous day + same/previous days in the last week)
Minimize either MLU or PR over the convex hull
Penalty envelope: bounded PR over all possible nonnegative TMs
See the dissertation for details of our LP formulation
Install COPE routing
Currently done once per day: an off-line solution
US-ISP: Maximum Link Utilization
Common cases: COPE is close to dynamic and much better than others
Unexpected cases: COPE beats even oblivious and is much better than others
Abilene: MLU in Common Cases
Common cases: COPE is close to optimal/dynamic and much better than others
COPE with Interdomain Routing
Motivation
Changes in the availability of interdomain routes can cause significant shifts of traffic within the domain
E.g., when a peering link fails, all traffic through that link is rerouted
Challenges
Point-to-multipoint demands: need to find splitting ratios among exit points
The set of exit points may change: the topology itself is dynamic
Too many prefixes: cannot enumerate all possible exit point changes
COPE with Interdomain Routing: A Two-Step Approach
Step 1: apply COPE on an extended topology to derive good splitting ratios
Group destination prefixes with the same set of exit points into a virtual node
Derive pseudo demands destined to each virtual node by merging demands to prefixes that belong to this virtual node
Connect each virtual node to the corresponding peers using virtual links with infinite bandwidth
Compute the extended topology G' = intradomain topology + peers + peering links + virtual nodes + virtual links
Apply COPE to compute routing on G' for the pseudo demands; derive splitting ratios based on the routes
Step 2: apply COPE on point-to-point demands to compute intradomain routing
Use the splitting ratios obtained in Step 1 to map point-to-multipoint demands into point-to-point demands
[Figure: intradomain topology connected to peers via peering links; virtual nodes attached to peers via virtual links]
Preliminary Evaluation
COPE can significantly limit the impact of peering link failures
Outline
Challenges to TE in a dynamic environment
COPE
Interdomain Reliability Service
Uses failures as the target application
Implementation
To Improve Reliability, We Need
Network redundancy
Over-provisioning of bandwidth
Diversity of physical connectivity
Challenge: significant investments
Extra equipment for over-provisioning
Expense & difficulty in obtaining rights-of-way for diversity
Our solution: REIN
Also, good traffic engineering algorithms for reliability
Challenge: scalable, efficient, and fast response to topology changes [Resilient Routing Reconfiguration]
REIN Business Model: Three Possibilities
Peering
Mutual backup w/o financial settlement
Incentive: improve reliability of both at low cost
Symmetry in backup path provisioning & usage
Cost-free
One-sided, volunteer and/or public service
Customer-Provider
Fixed or usage-based pricing; pricing should limit abuse
Interdomain Bypass Path Signaling
Many possibilities, e.g.,
Manual configuration
A new protocol
Utilizing BGP communities
REIN Data Forwarding
Main capability needed: allow traffic to leave and re-enter a network
Not supported under the hierarchical routing of the current Internet
REIN forwarding mechanisms
Interdomain GMPLS
IP tunneling
Either way, only an agreement between neighboring networks is needed
Incrementally deployable
BGP Bypass Path Signaling
B provides interdomain bypass paths to A. Task of A: discover a path to a1 through B
BGP announcement format: Dest. / AS path / Bypass path / Tag
Additional attributes: desired starting point (e.g., a2), bandwidth, etc.
[Figure: networks A (a1, a2, a3) and B (b1, b2, b3)]
A announces: a1 / A / a1 / REIN_PATH_REQ (installed in b1's RIB)
REIN local policy at B computes bypass paths to export, e.g., lightly-loaded paths
B announces: a1 / BA / b2,b1,a1 / REIN_PATH_AVAIL (installed in a2's RIB as a1 / BA / b2,b1,a1 / -)
Further Optimization
After an IP network imports a set of such paths, how does it effectively utilize them in routing computation?
How can it minimize the number of such paths?
Further Optimization: Minimize Interdomain Bypass Paths
Motivation
REIN may provide many alternatives; only a few may be necessary
Reduce configuration overhead & respect budget constraints
Step 1: connectivity objective
Preset connectivity requirement; costs associated with interdomain paths
Meet the connectivity requirement while minimizing total cost
Formulated as a Mixed Integer Program (MIP)
Step 2: TE objective
Sort interdomain paths according to a scoring function
Greedy selection until TE has the desired performance
Traffic Engineering for Reliability (TE-R)
Objectives
Efficient utilization of all redundant resources
Scalable: implementable in the current Internet
Protection: fast rerouting for common failure scenarios
Restoration: routing convergence for non-common failure scenarios
VPN QoS guarantees, if possible
[Figure: network topology for TE-R with intradomain links and REIN virtual links among a1, a2, a3]
Our TE-R Algorithm: Features
Robust normal-case routing f*, based on COPE [Wang et al. '06]
Guarantees bandwidth provisioning for hose-model VPNs under f*
Robust fast rerouting under failures on top of f*
VPN traffic stays purely intradomain if possible
Novel coverage-based techniques for computational feasibility and implementability
Use flow-based routing to compute the optimal solution
Use coverage to generate an implementation with a performance guarantee
For details, please see the dissertation.
Why We Need TE-R (Abilene, 1-link failures)
CSPF overloads the bottleneck link by ~300%, whereas robust TE-R successfully reroutes all traffic
[Plot: Abilene bottleneck link traffic intensity, 1-link failures, Tuesday, August 31, 2004]
Existing Reliability Techniques
Network redundancy techniques
Link-layer techniques, e.g., SONET rings. Pro: fast response; Con: expensive [Giroire et al. '03]
IP restoration
Online: MATE [Elwalid et al. '01] & TeXCP [Kandula et al. '05]
Offline: optimization of IGP weights [Fortz & Thorup '03, Nucci et al. '03]
Pro: inexpensive; Con: slow response
MPLS protection [RFC 3469]
Path protection: end-to-end; link protection: fast rerouting (FRR)
Supported by modern routers; fast response & affordable costs
All techniques depend on available network redundancy
TE for reliability
Restorable bandwidth-guaranteed connections [Kar et al. '03, Kodialam et al. '04, Kodialam & Lakshman '03]
Oblivious fast rerouting [Applegate et al. '04]
Overlays for reliability
RON [Andersen et al. '01], application-level source routing [Gummadi et al. '04]
Pro: do not require cooperation from the backbone
Con: slower response time (depends on transport-layer timeouts), less visibility into the network
REIN Path Advertisement Message
[Figure: message format]
Outline
Challenges to TE in a dynamic environment
TE with dynamic traffic
TE with dynamic topology
REIN
R3
Implementation of the proposed TE algorithms
R3 Motivation
Network topologies are constantly changing
Unexpected failures; scheduled maintenance
Multiple changes may overlap in time
Changes happen frequently
Some changes take a long time to recover: fiber cuts, maintenance
An IP network usually operates with multiple (simultaneous) topology changes
Existing TE Approaches
IP fast rerouting
IPFRR; multi-topology OSPF; Failure-Carrying Packets (FCP)
Pros:
Fast response
Scalable: e.g., FCP works as long as the network remains connected
Cons:
Only guarantee connectivity
No guarantee on the quality of the response, especially congestion
Existing TE Approaches (cont'd)
MPLS fast rerouting
Optimal demand-oblivious restoration [Applegate et al. '04]
COPE reliability routing [REIN TE-R]
Pros:
Fast response
Efficiency: provide some guarantee on the quality of response
Cons:
Re-optimize routing for each change scenario: not scalable
On-demand computation: slow response time
Pre-computation: need to keep many protection routings in routers
Disruption even to existing traffic not affected by the changes
Our Approach: R3
Objectives
Fast response + little disruption to existing traffic
Guarantee on the quality of response
Scalability (not too many protection routings kept in routers)
Our approach: R3 = Resilient Routing Reconfiguration
A single protection routing reconfigured according to the current topology (initial + changes)
Pre-computed protection + simple reconfiguration
Performance bound on congestion
A single protection routing works for many change scenarios
R3 Basic Idea
Consider link (i, j) with capacity 10 Mbps carrying a demand of 6 Mbps
If (i, j) fails, at most 10 Mbps of traffic needs rerouting
If the network can route 10 Mbps of additional, fictional traffic from i to j, failure recovery is guaranteed: at least as much traffic can be carried w/o using (i, j)
For the proof, see the dissertation
[Figure: link i→j with cap = 10 Mbps and demand = 6 Mbps; additional fictional traffic of 4 Mbps, then 6 Mbps]
R3: Topology-Uncertainty Demand
Topology uncertainty => traffic uncertainty
Link capacities serve as upper bounds on the traffic to reroute
Other choices: a given percentage of link capacities
Exponential number of topology changes => a convex set of topology-uncertainty demands
Integer relaxation:
Change s1 ~ topology-uncertainty demand x1
Change s2 ~ topology-uncertainty demand x2
Change s1 or change s2 ~ covered by α·x1 + (1−α)·x2, α ∈ [0,1]
R3: Routing Reconfiguration
For link e = (i, j), let g_e be the routing of the topology-uncertainty demand from i to j
The protection routing when e fails is a re-normalization of g_e after removing e:
g_e(e) = 0 (no traffic goes on e)
g_e(e′) = g_e(e′) / [1 − g_e(e)] (scale up to a unit flow)
Multiple link failures: reconfigure one by one; the final result is independent of the order of reconfiguration
Evaluation Methodology
Datasets
US-ISP: hourly PoP-level TMs for a tier-1 ISP (1 month in 2007)
RocketFuel: PoP-level gravity-model TMs
Topology changes
All single- and two-link changes
Sampled three- and four-link changes
For US-ISP, single-link + single-maintenance
TE algorithms
Base routing: R3, OSPF
Protection routing: R3; OSPF reconvergence, OSPF link detour, FCP
Flow-based optimal routing (optimal) used as reference
US-ISP: Worst Case with One Change
R3 and OSPF+R3 achieve close-to-optimal performance
All others lead to much higher levels of traffic intensity
US-ISP: One-Change Performance Ratio
R3 and OSPF+R3 consistently perform within 30% of optimal
All others incur much higher performance penalties (≥ 260%)
US-ISP: Two and Three Changes
R3 and OSPF+R3 significantly outperform the other algorithms
RocketFuel SBC: Two and Three Changes
R3 significantly outperforms the other algorithms, even OSPF+R3
For SBC, integrated R3 base + protection routing is necessary
RocketFuel Level-3: Two and Three Changes
R3 and OSPF+R3 have similar performance and significantly outperform the other algorithms
For Level-3, a good OSPF + R3 is enough
R3 Summary
A single protection routing that can be reconfigured to recover from multiple topology changes
Simple reconfiguration upon changes
Performance guarantee
A single protection routing works for many possible changes
Ongoing & future work
Implementation of routing reconfiguration on routers
Preventing transient loops during simultaneous reconfigurations
Generate MPLS Configurations
Configure LSPs for each O-D pair
Create explicit LSP paths
The requested bandwidth of each LSP is proportional to its (path-based) traffic split ratio
Configure backup LSPs/tunnels for each protected link (to implement R3)
Create explicit backup LSPs/tunnels
The requested bandwidth of each backup LSP/tunnel is proportional to its backup traffic split ratio
Future Directions on COPE
COPE with OSPF
COPE with online TE
COPE for other network optimization problems
Future Directions on REIN
A thorough study of the effects of cross-provider shared-risk link group data
Effectiveness of REIN on smaller IP networks
Improving TE robustness under dynamic topology
Our Approaches
REIN: a cost-effective way to increase redundancy
R3: scalable, congestion-free fast rerouting
Summary of Contributions
[Contributions diagram: traditional TE approaches on the network infrastructure, extended with Common-case Optimization with Penalty Envelope, Resilient Routing Reconfiguration, and the Interdomain Reliability Service]
What's Next
[Same contributions diagram]
Summary
[Same contributions diagram]
Why REIN: Overload Prevention (US-ISP failure log)
REIN can reduce normalized traffic intensity by 35% ~ 118%, depending on the TE-R algorithm used
[Plot: improvement of traffic intensity by REIN for a week in January 2007 for US-ISP]
R3 Backup Slides
US-ISP Single Failure: Zoom In
R3 achieves near-optimal performance (R3 vs. opt)
For each hour (in one day), compare the worst-case failure scenario for each algorithm
Level-3 Multiple Failures
[Plots — left: all two-failure scenarios; right: sampled three-failure scenarios]
R3 outperforms other schemes by > 50%
SBC Multiple Failures
[Plots — left: all two-failure scenarios; right: sampled three-failure scenarios]
"Ideal" R3 outperforms OSPF+R3 in some cases
Robustness on Base Routing
OSPFInvCap: link weight inversely proportional to bandwidth; OSPF: optimized link weights
[Plots — left: single failures; right: two failures]
A better base routing can lead to better routing protection
R3: Link Utilization
Bottleneck link load is kept under 0.37 using R3
Offline Precomputation Solution
[C2] contains an infinite number of constraints due to x
Consider the maximum "extra" load on link e caused by x:
Σ_{l∈E} x_l · p_l(e) ≤ UB iff there exist multipliers π_e(l) ≥ 0 and λ_e ≥ 0 (LP duality) s.t.
Σ_{l∈E} c_l · π_e(l) + F · λ_e ≤ UB and π_e(l) + λ_e / c_l ≥ p_l(e) ∀l ∈ E
This converts [C2] into a polynomial number of constraints
Traffic Priority
Service priority is a practical requirement of routing protection
Traffic priority example:
TPRT (real-time IP) traffic should be congestion-free under up-to-3 link failures (to achieve a 99.999% reliability SLA)
Private Transport (TPP) traffic should survive up-to-2 link failures
General IP traffic should only survive single-link failures
R3 with Traffic Priority
Attribute a protection level to each class of traffic
TPRT: protection level 3; TPP: protection level 2; IP: protection level 1
Traffic with protection level ≥ i should survive under the failure scenarios covered by protection level i
R3 with Traffic Priority: Algorithm
Precompute protection for each protection level i
Guarantee that each class of traffic has no congestion under the failure scenarios covered by its protection level
R3 with Traffic Priority: Simulation
Basic R3 vs. R3 with traffic priority
Methodology
US-ISP: a large tier-1 operational network topology; hourly PoP-level TMs (1 week in 2007)
Extract IPFR and PNT traffic from the traces, subtract them from the total, and treat the remainder as general IP
Protection levels: TPRT: up-to-4 link failures; TPP: up-to-2 link failures; IP: single link failures
Failure scenarios: all single-link failures enumerated; 100 worst cases of 2-link failures; 100 worst cases of 4-link failures
Traffic Protection Priority
IP: up-to-1 failure protection; TPP: up-to-2 failure protection; TPRT: up-to-4 failure protection
[Plots — left: single failures; right top: worst two failures; right bottom: worst four failures]
R3 respects the different traffic protection priorities
R3 Design: Routing Model
Network topology as graph G = (V, E)
V: set of routers
E: set of directed network links; link e = (i,j) has capacity c_e
Traffic matrix (TM)
A TM d is a set of demands: d = { d_ab | a,b ∈ V }
d_ab: traffic demand from a to b
Flow-based routing representation: r = { r_ab(e) | a,b ∈ V, e ∈ E }
r_ab(e): the fraction of the traffic from a to b (d_ab) carried by link e
e.g., r_ab(e) = 0.25 means link e carries 25% of the traffic from a to b
Online Reconfiguration
Step 1: fast rerouting after (a,c) fails
Precomputed p for link (a,c): p_ac(ac) = 0.4, p_ac(ab) = 0.3, p_ac(ad) = 0.3
(a,c) fails → the 0.4 needs to be carried by (a,b) and (a,d)
Rescaled: p_ac(ac) → 0, p_ac(ab) → 0.5, p_ac(ad) → 0.5
a locally rescales p_ac and activates fast rerouting
Efficiently implemented by extending MPLS label stacking
[Figure: load/capacity before and after fast rerouting; the 4 units on (a,c) (load 4/5) are shifted onto the a→b and a→d paths]