Presentation Transcript

Slide1

Efficient and Robust Traffic Engineering in a Dynamic Environment

Ph.D. Dissertation Defense
Hao Wang

Advisor: Y. Richard Yang
Committee: Joan Feigenbaum, Jennifer Rexford (Princeton), Avi Silberschatz
Slide 2

Efficient and Robust Internet Traffic Engineering
Y. Richard Yang
Nov. 2010
Based on slides from Hao Wang's defense and the R3 SIGCOMM presentation
Slide 3

Collaborators (2003 - 2010)
Richard Alimi
Alex Gerber (AT&T Research)
Albert Greenberg (Microsoft Research)
Paul H. Liu
Zheng Ma
Lili Qiu (UT Austin)
Jia Wang (AT&T Research)
Ye Wang
Haiyong Xie
Yin Zhang (UT Austin)
Slide 4

Internet as a Social Infrastructure
Growing fast
  IPTV 2007-2011: a 10x increase
  General IP traffic trend
Applications with more stringent requirements, e.g.,
  IPTV / VoD
  VoIP, telecommuting / video-conferencing

            2009    2010    2011    2012    2013    2014    CAGR
Consumer  11,602  16,534  23,750  32,545  43,117  55,801    37%
Business   3,083   3,862   4,740   5,697   6,801   8,103    21%

Source: Cisco VNI 2010. Unit: PB/month
Slide 5

11/16/2010

5

Internet Routing

Determines paths from source to destination

For applications

Delay, loss, delay jitter, …

For ISPs

Efficiency of resource usage

Capability of failure recovery

Scalability

Routing is becoming more important

High cost of network assets

Highly competitive nature of the ISP market
Slide 6

Traffic Engineering (TE)

Objective:

efficient & reliable routing

Input: Topology

Traffic

ISP Objective

App. Requirement

[Figure: example topology with nodes A-F and traffic split ratios 0.8 / 0.2]
Output: Routing
Slide 7

11/16/2010

7

Challenges in a Dynamic Environment

Traffic fluctuates

Topology changes

Multiple ISP objectives

[Figure: traffic engineering takes traffic, topology, ISP objectives, and application requirements as inputs and produces a routing]
Slide 8

11/16/2010

8

Challenge due to Traffic Dynamics

Traffic fluctuates, e.g.,

Diurnal patterns

Worms/viruses, DoS attacks, flash crowds

BGP routing changes, load balancing by multihomed customers, TE by peers, failures in other networks

Implications: can lead to long delay, high loss, reduced throughput
Slide 9

11/16/2010

9

Challenge due to Topology Dynamics

Topology changes, e.g.,

Maintenance, failures, misconfigurations

Accidents / disasters

e.g., 675,000 excavation accidents per year [Common Ground Alliance]
Network cable cuts every few days …

Implications: substantial disruption to Internet

E.g., two link failures in Sprint led to disconnection of millions of wireless users and partition of many office networks [Sprint]
Slide 10

11/16/2010

10

Challenge due to ISP Objectives

Network traffic & topology relatively stable most of the time

Unexpected scenarios do happen

Many unexpected scenarios happen when service is most valuable!

disaster, flash crowds,…

unhandled dynamics =>

violation of SLA

customers can remember bad experiences very well

How to balance between:

Common-case performance

Unexpected-case performanceSlide11

Summary of Project Contributions
Traffic Variations
Topology Changes
ISP Objectives
Network Infrastructure
Traffic Engineering
Slide 12

11/16/2010

12

11/16/2010

12

Summary of Contributions

Traffic Variations

Topology Changes

ISP Objectives

Network Infrastructure

Resilient Traffic Engineering Service

Interdomain Reliability ServiceSlide13

11/16/2010

13

11/16/2010

13

Summary of Contributions

Traffic Variations

Topology Changes

ISP Objectives

Network Infrastructure

Resilient Traffic Engineering Service

Interdomain

Reliability

Service

Resilient Routing Reconfiguration

Traffic VariationsSlide14

11/16/2010

14

11/16/2010

14

What's Next

Traffic Variations

Topology Changes

ISP Objectives

Network Infrastructure

Resilient Traffic Engineering Service

Interdomain Reliability Service

Resilient Routing Reconfiguration

Traffic VariationsSlide15

11/16/2010

15

Challenge: Unpredictable Traffic

Internet traffic is

highly unpredictable!

Can be relatively stable most of the time

However, usually contains spikes that ramp up extremely quickly

we saw traffic spikes in the traces of several networks

Unpredictable traffic variations have been observed and studied by other researchers

[Teixeira et al. ’04, Uhlig & Bonaventure ’02, Xu et al. ’05 ]

Confirmed by operators of several large networks via email surveySlide16

11/16/2010

16

Previous TE Approaches

Prediction-based TE

Examples:

Off-line:

Single predicted TM

[ Sharad et al. ’05 ]

Multiple predicted TMs

[ Zhang et al. ’05 ]

On-line: MATE [ Elwalid et al. '01 ] & TeXCP [ Kandula et al. '05 ]
Pro: works great when traffic is predictable
Con: may pay a high penalty when real traffic deviates substantially from the prediction
Slide 17

11/16/2010

17

Previous TE Approaches (cont’d)

Traffic-oblivious routing

Examples:

Oblivious routing

[ Racke ’02, Azar et al. ’03, Applegate et al. ’03 ]

Valiant load-balancing

[ Kodialam et al. ’05, Zhang & McKeown ’04 ]

Pro:

Provides worst-case performance bounds

Con:

May be sub-optimal for normal traffic

The optimal oblivious ratio of several real network topologies studied in [Applegate et al. '03] is ~2
An over-provision rate of ~2 is still too high
Slide 18

11/16/2010

18

Our Approach: COPE

Objectives

Good common-case performance

Tight worst-case guarantee

Our approach

Common-case optimization

Optimize for predicted, normal traffic load

Penalty envelope

Bound worst-case performance for the unexpectedSlide19

COPE Illustrated
Common-case Optimization with Penalty Envelope

  min_f  max_{d ∈ C}  P_C(f, d)
  s.t.  (1) f is a routing
        (2) ∀x ∈ X:  P_X(f, x) ≤ PE

C: common-case (predicted) TMs
X: all TMs of interest
P_C(f, d): common-case penalty function
P_X(f, x): worst-case penalty function
PE: penalty envelope
[Figure: the common-case set C sits inside the larger set X; COPE optimizes for set C while bounding the penalty over set X]
Slide 20

11/16/2010

20

COPE in Perspective

Prediction-based TE: best common-case + poor worst-case
Oblivious Routing: poor common-case + best worst-case
COPE: good common-case + bounded worst-case

Spectrum of TE with unpredictable traffic

Position controllable by penalty envelope

The worst unexpected case is too unlikely to occur
  Too wasteful to optimize for the worst case (at the cost of poor common-case performance)
There are enough unexpected cases
  Penalty envelope is required
Slide 21

11/16/2010

21

Model: More Details
Network topology: graph G = (V, E)
  V: set of routers
  E: set of network links; link (i, j) has capacity c_ij
Traffic matrices (TMs)
  A TM is a set of demands: d = { d_ab | a, b ∈ V }
  d_ab: traffic demand from a to b
Link-based routing
  f = { f_ab(i,j) | a, b ∈ V, (i, j) ∈ E }
  f_ab(i,j): the fraction of demand from a to b (i.e., d_ab) that is routed through link (i, j)
Slide 22

11/16/2010

22

Routing Performance Metric
Maximum Link Utilization (MLU):
  U(f, d) = max_{(i,j) ∈ E}  [ Σ_{a,b ∈ V} d_ab · f_ab(i,j) ] / c_ij
Slide 23
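To make the MLU definition above concrete, here is a minimal Python sketch that evaluates U(f, d) for a link-based routing; the dictionary layout and the toy three-node example are illustrative assumptions, not data or code from the dissertation.

```python
# Minimal sketch: Maximum Link Utilization U(f, d) for a link-based routing.
# The data layout and the toy topology are illustrative assumptions.

def max_link_utilization(f, d, cap):
    """f[(a, b)][(i, j)]: fraction of demand d[(a, b)] routed over link (i, j);
    cap[(i, j)]: capacity of link (i, j)."""
    utilization = []
    for link, c in cap.items():
        load = sum(dem * f[od].get(link, 0.0) for od, dem in d.items())
        utilization.append(load / c)
    return max(utilization)

# Toy example: demand A->C of 10 units, split 0.8 on the direct link and 0.2 via B.
cap = {("A", "C"): 10, ("A", "B"): 10, ("B", "C"): 10}
d = {("A", "C"): 10}
f = {("A", "C"): {("A", "C"): 0.8, ("A", "B"): 0.2, ("B", "C"): 0.2}}
print(max_link_utilization(f, d, cap))  # 0.8, from link (A, C)
```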

11/16/2010

23

Penalty Envelope: A Case Study
X: the set of TMs with access capacity constraints
P_X(f, x): MLU of routing f on demand x
PE: upper bound on MLU
Direct formulation of  ∀x ∈ X: P_X(f, x) ≤ PE  has an infinite number of cases
Slide 24

11/16/2010

24

Slave LP

Test whether constraint satisfied by solving the following LP for each link:

Constraint satisfied if the slave LP objective r* ≤ PE
Slave LP can serve as a separation oracle in the Ellipsoid Method
  still not fast enough
Can we use the Interior-Point Method?
Slide 25
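As a rough illustration of the slave-LP idea (not the dissertation's exact formulation), the sketch below checks the penalty-envelope constraint for a single link by maximizing that link's utilization over X; here X is modeled, purely as an assumption, by per-source access-capacity bounds, and scipy's linprog does the maximization by negating the objective. The envelope constraint for the link holds iff the returned r* is at most PE.

```python
# Hedged sketch of a per-link slave LP. Assumption: X is described by per-source
# access-capacity bounds  sum_b x_ab <= A_a  (one simple instance of "access
# capacity constraints"); the dissertation's exact constraint set may differ.
import numpy as np
from scipy.optimize import linprog

def slave_lp(f_e, c_e, pairs, sources, access_cap):
    """f_e[k]: fraction of OD pair pairs[k] routed on link e under routing f.
    Returns r* = max_{x in X} (sum_k x_k * f_e[k]) / c_e."""
    n = len(pairs)
    A_ub = np.zeros((len(sources), n))
    b_ub = np.array([access_cap[s] for s in sources], dtype=float)
    for row, s in enumerate(sources):
        for k, (a, b) in enumerate(pairs):
            if a == s:
                A_ub[row, k] = 1.0  # total demand leaving source s
    c = -np.array(f_e) / c_e        # linprog minimizes, so negate to maximize
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * n, method="highs")
    return -res.fun

# The constraint "forall x in X: utilization of link e <= PE" holds iff slave_lp(...) <= PE.
```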

11/16/2010

25

Dual of Slave LP

Introduce dual variables for the access capacity constraints:

Dual function of the slave LP:Slide26

11/16/2010

26

Dual of Slave LP

Since there is no duality gap, r* ≤ PE iff there exist feasible dual variables with dual objective ≤ PE
Polynomial # of constraints, can use the Interior-Point Method
Slide 27

11/16/2010

27

Evaluations

Evaluated algorithms

COPE: COPE with P_C(f,d) = PR(f,d) (i.e., performance ratio)
COPE-MLU: COPE with P_C(f,d) = U(f,d) (i.e., max link utilization)
Oblivious routing: min_f max_x PR(f,x) (COPE with β = 1)
Dynamic: optimize routing for the TM in the previous interval
Peak: peak interval of previous day + prev/same days in last week
Multi: all intervals in previous day + prev/same days in last week
Optimal: requires an oracle
Dataset
  US-ISP: hourly PoP-level TMs for a tier-1 ISP (1 month in 2005); optimal oblivious ratio: 2.045; default penalty envelope: 2.5
  Abilene: 5-min router-level TMs on Abilene (6 months: Mar - Sep. 2004); optimal oblivious ratio: 1.853; default penalty envelope: 2.0
Slide 28

11/16/2010

28

US-ISP: Performance Ratio

Common cases: COPE is close to dynamic and much better than others

Unexpected cases: COPE beats even oblivious and is much better than others

Performance Ratio = MLU of the algorithm / MLU of optimal routing
Slide 29

11/16/2010

29

Abilene: Performance Ratio

Common cases: COPE is close to dynamic and much better than others

Unexpected cases: COPE is close to oblivious and much better than othersSlide30

11/16/2010

30

Abilene: MLU in Unexpected Cases

Unexpected cases: COPE is close to oblivious and much better than othersSlide31

11/16/2010

31

US-ISP: Sensitivity to PE

Even a small margin in PE can significantly improve the common-case performanceSlide32

11/16/2010

32

COPE Summary
COPE = Common-case Optimization with Penalty Envelope
COPE works!
  Common cases: close to optimal; much better than oblivious routing and prediction-based TE, with comparable overhead
  Unexpected cases: much better than prediction-based TE, and sometimes may beat oblivious routing
  Even a small margin in PE improves common-case performance a lot
COPE as an optimization paradigm also applies to many other contexts, e.g., interdomain
Slide 33

Remaining Issues: Topology Dynamics

COPE handles traffic variations, not topology variations, but failures are common in operational IP networks
  Accidents, attacks, hardware failures, misconfigurations, maintenance
Multiple unexpected failures may overlap
  e.g., concurrent fiber cuts in Sprint (2006)
Planned maintenance affects multiple network elements
  May overlap with unexpected failures (e.g., due to inaccurate SRLG)
Increasingly stringent requirements on reliability
  VoIP, video conferencing, gaming, mission-critical apps, etc.
  SLAs have teeth → violation directly affects ISP revenue
Need resiliency: the network should recover quickly & smoothly from one or multiple overlapping failures
Slide 34

Challenge: Topology Uncertainty

Number of failure scenarios quickly explodes
  500-link network, 3-link failures: > 20,000,000!
Difficult to optimize routing to avoid congestion under all possible failure scenarios
  Brute-force failure enumeration is clearly infeasible
  Existing methods handle only 100s of topologies
Difficult to install fast routing
  Preconfigure 20,000,000 backup routes on routers?
Slide 35

Existing Approaches & Limitations
Focus exclusively on reachability
  e.g., FRR, FCP (Failure Carrying Packets), Path Splicing
  May suffer from congestion and unpredictable performance
    congestion mostly caused by rerouting under failures [Iyer et al.]
    multiple network element failures have a domino effect on FRR rerouting, resulting in network instability [N. So & H. Huang]
Only consider a small subset of failures
  e.g., single-link failures [D. Applegate et al.]
  Insufficient for demanding SLAs
Online routing re-optimization after failures
  Too slow: cannot support fast rerouting
Slide 36

R3:

Resilient Routing Reconfiguration

A novel link-based routing protection scheme
  requiring no enumeration of failure scenarios
  provably congestion-free for all up-to-F link failures
  efficient w.r.t. router processing/storage overhead
  flexible in supporting diverse practical requirements
Slide 37

Problem Formulation
Goal: congestion-free rerouting under up-to-F link failures
Input: topology G(V, E), link capacity c_e, traffic demand d
  d_ab: traffic demand from router a to router b
Output: base routing r, protection routing p
  r_ab(e): fraction of d_ab carried by link e
  p_l(e): (link-based) fast rerouting for link l
[Figure: four-node example (a, b, c, d) with link capacities 5 and 10; demand d_ab = 6; base routing r_ab(ab) = 1; protection routing for link ab: p_ab(ac) = 1, p_ab(cb) = 1]
Slide 38
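To fix notation before the following slides, here is a small illustrative sketch (not project code; the dictionary shapes and link names are assumptions) of how a base routing r, a protection routing p, and a rerouted-traffic vector x combine into the load on a link, mirroring the constraint used later in the offline precomputation.

```python
# Sketch: load on link e under base routing r and protection routing p,
#   load(e) = sum_{a,b} d_ab * r_ab(e) + sum_l x_l * p_l(e)
# (all dictionaries below are illustrative).
def link_load(e, d, r, p, x):
    base = sum(dem * r[od].get(e, 0.0) for od, dem in d.items())
    rerouted = sum(xl * p[l].get(e, 0.0) for l, xl in x.items())
    return base + rerouted

# In the spirit of the slide's example: d_ab = 6 routed on link ab (r_ab(ab) = 1),
# protected over ac and cb (p_ab(ac) = p_ab(cb) = 1); x_ab = 6 if ab fails.
d = {("a", "b"): 6}
r = {("a", "b"): {"ab": 1.0}}
p = {"ab": {"ac": 1.0, "cb": 1.0}}
x = {"ab": 6}
print(link_load("ac", d, r, p, x))  # 6.0
```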

From Topology Uncertainty to Traffic Uncertainty
Instead of optimizing for the original traffic demand on all possible topologies under failures,
R3 optimizes a protection routing for a set of traffic demands on the original topology
  The rerouting virtual demand set captures the effect of failures on the amount of rerouted traffic
  The protection routing on the original topology can easily be reconfigured for use after a failure occurs
Slide 39

Rerouting Virtual Demand Set
Failure scenario (f) → rerouted traffic (x)
Rerouted traffic under all possible up-to-F-link failure scenarios (independent of r):
  X_F = { x | 0 ≤ x_l ≤ c_l,  Σ_l (x_l / c_l) ≤ F }   (convex combination)
Rerouted traffic x_l after link l fails = base load on l given r
  (r is congestion-free ⇒ x_l ≤ c_l)
[Figure: four-node example with load/capacity 4/5, 0/10, 0/10, 4/10, 2/10 on its links]

Failure scenario | Rerouted traffic | Upper bound of rerouted traffic
ac fails         | x_ac = 4         | x_ac ≤ 5 (= c_ac)
ab fails         | x_ab = 2         | x_ab ≤ 10 (= c_ab)

Slide 40
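Because X_F has the simple algebraic description above, testing whether a rerouted-traffic vector belongs to it is a one-liner; the sketch below is an illustration only (the link names are assumptions), using the slide's numbers for the failure of link ac.

```python
# Sketch: membership test for the rerouting virtual demand set
#   X_F = { x | 0 <= x_l <= c_l,  sum_l x_l / c_l <= F }.
def in_XF(x, cap, F, eps=1e-9):
    """x[l]: rerouted traffic attributed to link l; cap[l]: capacity of link l."""
    if any(x[l] < -eps or x[l] > cap[l] + eps for l in cap):
        return False
    return sum(x[l] / cap[l] for l in cap) <= F + eps

# Example from the slide: c_ac = 5 and, when ac fails, x_ac = 4 (zero elsewhere).
cap = {"ab": 10, "ac": 5, "ad": 10, "dc": 10, "cb": 10}  # illustrative link names
x = {"ab": 0, "ac": 4, "ad": 0, "dc": 0, "cb": 0}
print(in_XF(x, cap, F=1))  # True: 4/5 <= 1
```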

R3 Overview

Offline precomputation
  Plan (r, p) together for original demand d plus rerouting virtual demand x on the original topology G(V, E) to minimize congestion
  p may "use" links that will later fail
Online reconfiguration
  Convert and use p for fast rerouting after failures
Slide 41

Offline Precomputation
Compute (r, p) to minimize MLU (Max Link Utilization) for original demand d + rerouting demand x ∈ X_F
  r carries d, p carries x ∈ X_F

  min_(r,p)  MLU
  s.t. [1] r is a routing, p is a routing
       [2] ∀x ∈ X_F, ∀e:  [ Σ_{a,b∈V} d_ab r_ab(e)  +  Σ_{l∈E} x_l p_l(e) ] / c_e  ≤  MLU
  (first sum: original traffic; second sum: rerouting traffic)

Challenge: [2] has an infinite number of constraints
Solution: apply LP duality → a polynomial # of constraints
Slide 42
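The duality step is only named on the slide; one standard way to carry it out (a reconstruction consistent with the π_e(l), λ_e notation used in the backup slides, not necessarily the dissertation's exact form) dualizes the inner maximization of the rerouted load per link e:

```latex
% Hedged reconstruction: for each link e, dualize  max_{x \in X_F} \sum_{l} x_l\, p_l(e)
% with X_F = \{x : 0 \le x_l \le c_l,\ \sum_l x_l / c_l \le F\}.
% Constraint [2] holds iff there exist \pi_e(l) \ge 0 and \lambda_e \ge 0 such that
\begin{align}
\frac{1}{c_e}\Big[\sum_{a,b \in V} d_{ab}\, r_{ab}(e) \;+\; \sum_{l \in E} c_l\, \pi_e(l) \;+\; F\,\lambda_e\Big] &\le \mathrm{MLU}, \\
\pi_e(l) + \frac{\lambda_e}{c_l} &\ge p_l(e) \qquad \forall l \in E,
\end{align}
% which is a polynomial number of linear constraints.
```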

Online Reconfiguration
Step 1: Fast rerouting after ac fails
  Precomputed p for ac: p_ac(ac) = 1/3, p_ac(ab) = 1/3, p_ac(ad) = 1/3
  ac fails → fast reroute using p_ac \ p_ac(ac), equivalent to a rescaled p_ac:
    ξ_ac(ac) = 0, ξ_ac(ab) = 1/2, ξ_ac(ad) = 1/2
  In general, link l fails → ξ_l(e) = p_l(e) / (1 − p_l(l))
[Figure: p_ac(e) on the four-node example (values 1/3, 1/3, 1/3, 2/3, 1/3) and the rescaled ξ_ac(e) (values 0, 1/2, 1/2, 1/2, 1)]
Slide 43
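The Step 1 rescaling is mechanical; a minimal Python sketch with illustrative names (not the router implementation) follows.

```python
# Sketch of Step 1: rescale the precomputed protection routing p_l when link l fails:
#   xi_l(e) = p_l(e) / (1 - p_l(l)),  xi_l(l) = 0.
def rescale(p_l, failed_link):
    """p_l: dict mapping link e -> p_l(e) for protected link l."""
    denom = 1.0 - p_l.get(failed_link, 0.0)
    xi = {e: frac / denom for e, frac in p_l.items()}
    xi[failed_link] = 0.0
    return xi

# Example from the slide: p_ac = 1/3 on each of ac, ab, ad.
p_ac = {"ac": 1 / 3, "ab": 1 / 3, "ad": 1 / 3}
print(rescale(p_ac, "ac"))  # {'ac': 0.0, 'ab': 0.5, 'ad': 0.5}
```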

Online Reconfiguration (Cont.)
Step 2: Reconfigure p after the failure of ac
  Current p for ab: p_ab(ac) = 1/2, p_ab(ad) = 1/2
  ac fails → the 1/2 on ac needs to be "detoured" using ξ_ac
    → p_ab(ac) = 0, p_ab(ad) = 3/4, p_ab(ab) = 1/4
  In general, link l fails → ∀l':  p_l'(e) = p_l'(e) + p_l'(l) · ξ_l(e)
Apply the detour ξ_ac on every protection routing (for other links) that is using ac
[Figure: updated p_ab(e) on the four-node example after reconfiguration]
Slide 44
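Step 2 folds the detour into every other protection routing that was using the failed link; a small sketch under the same illustrative data structures, reproducing the slide's numbers:

```python
# Sketch of Step 2: after link l fails with detour xi_l, update every other
# protection routing p_l':  p_l'(e) += p_l'(l) * xi_l(e), then p_l'(l) = 0.
def reconfigure(p, failed_link, xi):
    """p: dict protected_link -> (dict link e -> fraction)."""
    for protected, route in p.items():
        if protected == failed_link:
            continue
        shifted = route.pop(failed_link, 0.0)
        for e, frac in xi.items():
            route[e] = route.get(e, 0.0) + shifted * frac
    return p

# Slide example: p_ab used ac and ad with 1/2 each; ac fails with xi_ac = {ab: 1/2, ad: 1/2}.
p = {"ab": {"ac": 0.5, "ad": 0.5}}
xi_ac = {"ac": 0.0, "ab": 0.5, "ad": 0.5}
print(reconfigure(p, "ac", xi_ac))  # p_ab(ac)=0, p_ab(ad)=0.75, p_ab(ab)=0.25
```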

R3 Guarantees

Sufficient condition for congestion-freedom
  if ∃(r, p) s.t. MLU ≤ 1 under d + X_F → no congestion under any failure involving up to F links
Necessary condition under single link failures
  if there exists a protection routing that guarantees no congestion under any single-link failure scenario → ∃(r, p) s.t. MLU ≤ 1 under d + X_1
  Adding a superset of the rerouted traffic to the original demand is not so wasteful
Open problem: is R3 optimal for >1 link failures?
R3 online reconfiguration is order-independent under multiple failures
Slide 45

Intuition behind R3
Plan rerouting on the single original topology
  Avoid enumeration of topologies (failure scenarios)
  Compute (r, p) to guarantee congestion-freedom for d + x ∈ X_F on G
Puzzles
  Add rerouted traffic before it appears
    X_F = { x | 0 ≤ x_l ≤ c_l, Σ_l (x_l / c_l) ≤ F } is a superset of the rerouted traffic under failures
    Not real traffic demand!
  Use network resources that will disappear
    Protection routing p uses links that will fail
    Topology with links that will fail!
By doing two counter-intuitive things, R3 achieves:
  Congestion-freedom under multiple link failures w/o enumeration of failure scenarios
  Optimality for single link failures
  Fast rerouting
Slide 46

R3 Extensions

Fixed base routing
  r can be given (e.g., as an outcome of OSPF)
Trade-off between no-failure and failure protection
  Add penalty envelope β (≥ 1) to bound no-failure performance
Trade-off between MLU and end-to-end delay
  Add envelope γ (≥ 1) to bound end-to-end path delay
Prioritized traffic protection
  Associate different protection levels with traffic of different priorities
Realistic failure scenarios
  Shared Risk Link Group, Maintenance Link Group
Traffic variations
  Optimize (r, p) for d ∈ D + x ∈ X_F
Slide 47

Evaluation Methodology

Network Topology
  Real: Abilene, US-ISP (PoP-level)
  Rocketfuel: Level-3, SBC, and UUNet
  Synthetic: GT-ITM
Traffic Demand
  Real: Abilene, US-ISP
  Synthetic: gravity model
Failure Scenarios
  Abilene: all failure scenarios with up to 3 physical links
  US-ISP: maintenance events (6 months) + all 1-link, 2-link failures + ~1100 sampled 3-link failures
  Enumeration only needed for evaluation, not for R3
Slide 48

Evaluation Methodology (cont.)

R3 vs. other rerouting schemes
  OSPF+R3: add R3 rerouting to OSPF
  MPLS-ff+R3: "ideal" R3 (flow-based base routing)
  OSPF+opt: benchmark, optimal rerouting (based on OSPF)
  OSPF+CSPF-detour: commonly used
  OSPF+recon: ideal OSPF reconvergence
  FCP: Failure Carrying Packets [Lakshminarayanan et al.]
  PathSplice: Path Splicing (k=10, a=0, b=3) [Motivala et al.]
Performance metrics
  MLU (Maximum Link Utilization) measures congestion; lower is better
  performance ratio = MLU of the algorithm / MLU of optimal routing for the changed topology (corresponding to the failure scenario); always ≥ 1, closer to 1 is better
Slide 49

US-ISP Single Failure

49

R3 achieves near-optimal performance (R3 vs. opt);
R3 out-performs other schemes significantly.
Slide 50

US-ISP Multiple Failures

50

R3 consistently out-performs other schemes by at least 50%

all two-link failure scenarios

sampled three-link failure scenariosSlide51

US-ISP No Failure: Penalty Envelope

51

R3 near optimal under no failures with 10% penalty envelopeSlide52

Experiments using Implementation

R3 Linux software router implementation
  Based on Linux kernel 2.6.25 + Linux MPLS
  Implement flow-based fast rerouting and efficient R3 online reconfiguration by extending Linux MPLS
Abilene topology emulated on Emulab
3 physical link failures (6 directed link failures)

52Slide53

R3 vs OSPF-recon: Link Utilization

53

R3 out-performs OSPF+recon by a factor of ~3Slide54

Precomputation Complexity

Profile computation time (in seconds) Computer: 2.33 GHz CPU, 4 GB memory54

Offline precomputation time < 36 minutes

(operational topologies, < 17 minutes)

Computation time stable with higher #failures.Slide55

Linux Implementation

MPLS-ff software router
  Based on Linux kernel 2.6.25 + Linux MPLS
  Implement flow-based routing for efficient R3 online reconfiguration
  Extend the MPLS FWD (forward) data structure to enable per-hop traffic splitting for each flow
Failure detection and response
  Detection using Ethernet monitoring in the kernel
  In operational networks, detection can be conservative (for SRLG, MLG)
  Notification using ICMP 42 flooding
    Requires reachability under failures
  Rerouting traffic by MPLS-ff label stacking
  Online reconfiguration of traffic splitting ratios (locally at each router)
Slide 56

56

MPLS-ff
R3 uses flow-based traffic splitting (for each OD pair)
  e.g., p_newy→wash(chic→hous) = 0.2 means link chic→hous will carry 20% of the traffic originally carried by newy→wash when newy→wash fails
Current MPLS routers only support path-based traffic splitting
  Traffic load on equal-cost LSPs is proportional to the bandwidth requested by each LSP
  Juniper J-, M-, T-series and Cisco 7200, 7500, 12000 series
  e.g., multiple paths from newy to wash can be set up to protect link newy→wash
Slide 57

57

MPLS-ff

Convert flow-based to path-based routing

e.g., using flow decomposition algorithm

-- [Zheng et al]

Expensive to implement R3 online reconfiguration

Need to recompute and signal LSPs after each failure

Extend MPLS to support flow-based routing

MPLS-ff

Enable next-hop traffic splitting ratios for each flow

flow: traffic originally carried by a protected link

57

57Slide58

58

MPLS-ff

MPLS FWD data structure

Extended to support

multiple

Next Hop Label Forwarding Entry (NHLFEs)

One NHLFE specifying one neighbor

One Next-hop Splitting Ratio for each NHLFE

Label 200 FWD:

NHLFE R2: 50%

NHLFE R3: 50%

NHLFE R4: 0%

R1

R2

R4

R3

58Slide59

MPLS-ff
Implement Next-hop Splitting Ratio
  router i, next-hop j, protected link (a, b):
Packet hashing
  Same hash value for packets in the same TCP flow
  Independent hash values on different routers for any particular TCP flow
  Hash of (packet flow fields + router id)
  Pitfall if Hash = f(src, dst, srcPort, dstPort) only:
    i forwards to j only packets with 40 < Hash < 64, so all packets arriving at j from i have 40 < Hash < 64
    (skewed hash value distribution)
Slide 60
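The hashing remark above is the subtle part; the sketch below (illustrative only, not the kernel code) shows a per-router hash salted with a router id, which is what keeps each router's split ratios independent and avoids the skewed-distribution problem the slide warns about.

```python
# Sketch: per-flow hashing for next-hop splitting, salted with a router id so that
# downstream routers do not inherit a skewed hash range from upstream splitting.
import hashlib

def flow_hash(src, dst, sport, dport, router_id, buckets=64):
    key = f"{src}|{dst}|{sport}|{dport}|{router_id}".encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big") % buckets

def pick_next_hop(ratios, h, buckets=64):
    """ratios: list of (next_hop, fraction) summing to 1; map hash bucket -> next hop."""
    threshold = 0.0
    for hop, frac in ratios:
        threshold += frac * buckets
        if h < threshold:
            return hop
    return ratios[-1][0]

h = flow_hash("10.0.0.1", "10.0.1.1", 5555, 80, router_id="R1")
print(pick_next_hop([("R2", 0.5), ("R3", 0.5), ("R4", 0.0)], h))
```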

60

Failure Response

Failure Detection and Notification

detection: Ethernet monitoring in Linux kernel

notification: ICMP 42 flood

Failure Response

MPLS-ff label stacking

Protection Routing Update

R3 online reconfiguration

requires a local copy of

p

on each router

60Slide61

Router Storage Overhead

Estimate maximum router storage overhead for our implementation
ILM (MPLS incoming label mapping) and NHLFE (MPLS next-hop label forwarding entry) entries are required to implement R3 protection routing

61

Modest router storage overhead:

FIB < 300 KB, RIB < 20 MB,

#ILM < 460, #NHLFE < 2,500Slide62

62

R3 Linux Implementation: Flow

RTT (Denver-Los Angeles)

R3 implementation achieves smooth routing protection

62Slide63

63

R3: OD pair throughput

Traffic demand is carried by R3

under multiple link failure scenarios

63Slide64

Summary

R3: Resilient Routing Reconfiguration

Provably congestion-free under multiple failures
  Key idea: convert topology uncertainty to traffic uncertainty
  Offline precomputation + online reconfiguration
  Flexible extensions to support practical requirements
Trace-driven simulation
  R3 near optimal (>50% better than existing approaches)
Linux implementation
  Feasibility and efficiency of R3 in real networks
Slide 65

11/16/2010

65

11/16/2010

65

What's Next

Traffic Variations

Topology Changes

ISP Objectives

Network Infrastructure

Common-case Optimization with Penalty Envelope

Interdomain Reliability Service

Resilient Routing Reconfiguration

Traffic VariationsSlide66

11/16/2010

66

Any future Internet should attain the

highest possible level of availability

, so that it can be used for

mission-critical activities

, and it can serve the nation

in times of crisis

- GENI, 2006Slide67

11/16/2010

67

The 3 elements which carriers are most concerned about when deploying communication services are:

Network

reliability

Network

usability

Network

fault processing capabilities

Telemark, 2006

The top 3 all belong to reliability!Slide68

11/16/2010

68

Reliability Needs Redundancy

Over-provisioning of bandwidth

Diversity of physical connectivity

Challenge:

significant investments

Extra equipment for over-provisioning

Expense & difficulty in obtaining rights-of-way for diversitySlide69

11/16/2010

69

REIN Overview

REliability as an INterdomain Service

Objective

Increase the redundancy available to an IP network at low cost

Basic idea

Observation: IP networks overlap, yet they differ

IP networks provide redundancy for each other

Effects: Sharing improves reliability and reduces costs

Analogy: insurance, airline allianceSlide70

11/16/2010

70

Example: Jan. 9, 2006 (a major US ISP)
[Figure: US map with PoPs Stockton, Rialto, El Paso, Oroville, Los Angeles, Dallas and an interdomain bypass path through AT&T]
Slide 71

11/16/2010

71

How to Make REIN Work: the Details

Why would IP networks share interdomain bypass paths?

Peering, cost-free, or customer-provider

What is the

signaling protocol

to share these paths?

Manual, new protocol, BGP communities

How can an interdomain bypass path be used in the

intradomain forwarding

path?

Interdomain GMPLS

IP tunnelsSlide72

11/16/2010

72

Evaluation

Dataset

US-ISP

hourly PoP-level TMs for a tier-1 ISP (1 month in 2007)

Abilene
  5-min router-level TMs on Abilene (6 months: Mar - Sep. 2004)
RocketFuel PoP-level topologies
TE algorithms
  TE-R (robust) [this is COPE with REIN awareness]
  Oblivious routing/bypassing (oblivious)
  Constrained Shortest Path First rerouting (CSPF)
  Flow-based optimal routing (optimal)
Slide 73

11/16/2010

73

Why REIN: Connectivity Improvements

Without REIN, as high as 60% of links w/ conn. < 3, esp. in some smaller networks

With <= 7 REIN routes: reduce to < 10%Slide74

11/16/2010

74

Why REIN: Overload Prevention

(Abilene 2-link)

Without REIN, even optimal routing overloads bottleneck links by ~300%.

With 10 interdomain bypass paths of 2Gbps each, REIN reduces MLU to ~80%

Abilene bottleneck link traffic intensity: 2-link failures, Tuesday, August 31, 2004 Slide75

11/16/2010

75

REIN Summary

An interdomain service to improve the redundancy of IP networks at low cost

Significantly improves network reliability, esp. when used with COPE to utilize network resources under failuresSlide76

11/16/2010

76

Outline

Challenges to TE in dynamic environment

COPE/R3

REIN

From algorithms to a toolkitSlide77

11/16/2010

77

Torte: A Toolkit for Optimal & Robust TE
http://www-net.cs.yale.edu/projects/torte/tools/
[Figure: Torte pipeline]
  Inputs: network topology, traffic demand matrices, penalty envelopes, protected link sets
  Compute TE routing → f
  Convert link-based to path-based → paths with traffic split ratios
  Generate router configuration:
    - Global MPLS configuration
    - Explicit path configuration
    - LSP configuration
    - Backup LSP/tunnel configuration
    - Output Juniper/Cisco configurations
Slide 78

11/16/2010

78

Link-based vs. Path-based Routing
Our algorithms compute link-based traffic split ratios (for each O-D pair)
  e.g., f_newy→losa(chic→hous) = 0.25 means link chic→hous carries 25% of the traffic from newy to losa
Current MPLS-enabled routers support only path-based traffic split ratios
  Traffic load on equal-cost LSPs is (typically) proportional to the bandwidth requested by each LSP
  Juniper J-, M-, T-series
  Cisco 7200, 7500, 12000 series
Slide 79

11/16/2010

79

From Link-based to Path-based

Use the coverage-based method developed at Yale LANS to convert link-based routing to path-based routing [Zheng et al., 2007]
A flow decomposition algorithm selects paths one by one until reaching the required coverage (for the O-D pair); see the sketch after this slide
Slide 80
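As a rough sketch of the coverage-based idea (the actual Yale LANS algorithm in [Zheng et al., 2007] may differ in how paths are selected), the code below repeatedly extracts a path from one O-D pair's link-based flow, assigns it the bottleneck fraction, and stops once the selected paths cover the requested fraction q of the flow.

```python
# Hedged sketch of flow decomposition for one O-D pair: convert link-based split
# fractions into explicit paths until a coverage fraction q is reached.
# Assumes the per-OD flow is acyclic; function and variable names are illustrative.
def decompose(flow, src, dst, q=0.95):
    """flow[(i, j)]: fraction of the O-D demand carried on link (i, j)."""
    flow = dict(flow)
    paths, covered = [], 0.0
    while covered < q:
        path, node = [src], src
        while node != dst:  # walk along the largest remaining fraction
            nxt = max((l for l in flow if l[0] == node and flow[l] > 1e-12),
                      key=lambda l: flow[l], default=None)
            if nxt is None:
                return paths  # no further src-dst path in the remaining flow
            path.append(nxt[1])
            node = nxt[1]
        bottleneck = min(flow[(path[k], path[k + 1])] for k in range(len(path) - 1))
        for k in range(len(path) - 1):
            flow[(path[k], path[k + 1])] -= bottleneck
        paths.append((path, bottleneck))
        covered += bottleneck
    return paths

# Toy example: 0.8 on the direct link, 0.2 via B.
print(decompose({("A", "C"): 0.8, ("A", "B"): 0.2, ("B", "C"): 0.2}, "A", "C", q=0.9))
```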

11/16/2010

80

An Example

11/16/2010

80Slide81

11/16/2010

81

Performance Bound of Path Generation

Theorem: Given a link-based routing f and a q-coverage path set for f, there is a path-based routing over the q-coverage path set s.t. for any input, the MLU under the path-based routing is bounded by 1/q of the MLU achieved by f.
Slide 82

11/16/2010

82

11/16/2010

82

Summary of Contributions

Traffic Variations

Topology Changes

ISP Objectives

Network Infrastructure

Resilient Traffic Engineering

Interdomain

Reliability

Service

Resilient Routing Reconfiguration

Traffic VariationsSlide83

11/16/2010

83

Future Directions

Problem of pure MPLS TE

May need to configure many LSPs

80 core routers

~ avg. 5 LSPs per O-D pair

Total of 80 x 80 x 5 = 32000 LSPs!

Configuration overhead

Router memory requirement

Possible solutions

Pick heavy-hitter O-D pairs

Others will just use OSPF default routing

Hybrid OSPF/MPLS TE

How to optimize OSPF other than using heuristicsSlide84

11/16/2010

84

Future Directions (cont’d)

Online adaptation with penalty envelope

Network and/or traffic evolve over time

The “normal network load” may change dramatically in a medium time scale

Diurnal pattern

Weekly pattern

Possible solutions

Gradually adapts to changes in network/traffic condition

Difficulty: maintain robustness during the course of adaptationSlide85

11/16/2010

85

Thank you!Slide86

11/16/2010

86

11/16/2010

86

Summary of Contributions

Traffic Variations

Topology Changes

ISP Objectives

Network Infrastructure

Common-case Optimization with Penalty Envelope [SIGCOMM’06]

Interdomain Reliability Service [SIGCOMM’07]

Resilient Routing Reconfiguration

Traffic VariationsSlide87

11/16/2010

87

11/16/2010

87

Summary of Contributions

Traffic Variations

Topology Changes

ISP Objectives

Network Infrastructure

Common-case Optimization with Penalty Envelope

Interdomain Reliability Service

Resilient Routing Reconfiguration

Traffic VariationsSlide88

Backup Slides

11/16/2010

88Slide89

11/16/2010

89

Outline

Challenges to TE in dynamic environment

Common-case Optimization with Penalty Envelope

Use traffic variation as target application

Interdomain Reliability Service

ImplementationSlide90

11/16/2010

90

Routing Performance Metrics

Maximum Link Utilization (MLU): U(f, d)
Optimal Utilization: OU(d) = min_f U(f, d)
Performance Ratio: PR(f, d) = U(f, d) / OU(d)
Slide 91

11/16/2010

91

COPE Instantiation

C: convex hull of multiple past TMs
  A linear predictor predicts the next TM as a convex combination of past TMs (e.g., EWMA)
  Aggregation of all possible linear predictors → the convex hull
X: all possible non-negative TMs
  Can add access capacity constraints or use a bigger convex hull
P_C(f,d): penalty function for common cases
  maximum link utilization: U(f,d)
  performance ratio: PR(f,d)
P_X(f,x): penalty function for worst cases
  maximum link utilization: U(f,x)
  performance ratio: PR(f,x)

  min_f  max_{d ∈ C}  P_C(f, d)
  s.t. (1) f is a routing; and (2) ∀x ∈ X: P_X(f, x) ≤ PE

Slide 92
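One step the slide leaves implicit: when the common-case penalty is the MLU U(f, d), which is a maximum of functions linear in d, its maximum over the convex hull of past TMs is attained at a vertex of the hull, so the common-case part of the LP only needs one set of constraints per past TM (the performance-ratio penalty requires a more careful treatment). A hedged restatement:

```latex
% Assuming P_C = U (MLU), which is convex in d:
% if C = \mathrm{conv}\{d^{(1)}, \dots, d^{(H)}\} for past TMs d^{(h)}, then
\max_{d \in C} U(f, d) \;=\; \max_{1 \le h \le H} U\big(f, d^{(h)}\big),
% so optimizing over the convex hull reduces to one set of constraints per past TM.
```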

11/16/2010

92

Choosing the Penalty Envelope

PE = β · min_f max_{x ∈ X} P_X(f, x)
β ≥ 1 controls the size of PE w.r.t. the optimal worst-case penalty
  β = 1 → oblivious routing
  β = ∞ → prediction-based TE
Slide 93

11/16/2010

93

COPE Implementation

Collect TMs continuously

Compute COPE routing for the next day by solving a linear program (LP)

Common-case optimization

Common case: convex hull of multiple past TMs

All TMs in previous day + same/previous days in last week

Minimize either MLU or PR over the convex hull

Penalty envelope

Bounded PR over all possible nonnegative TMs

See dissertation for details of our LP formulation

Install COPE routing

Currently done once per day → an off-line solution
Slide 94

11/16/2010

94

US-ISP: Maximum Link Utilization

Common cases: COPE is close to dynamic and much better than others

Unexpected cases: COPE beats even oblivious and is much better than othersSlide95

11/16/2010

95

Abilene: MLU in Common Cases

Common cases: COPE is close to optimal/dynamic and much better than othersSlide96

11/16/2010

96

COPE with Interdomain Routing

Motivation

Changes in availability of interdomain routes can cause significant shifts of traffic within the domain

E.g. when a peering link fails, all traffic through that link is rerouted

Challenges

Point-to-multipoint demands

need to find splitting ratios among exit points

The set of exit points may change

topology itself is dynamic
Too many prefixes → cannot enumerate all possible exit point changes
Slide 97

11/16/2010

97

COPE with Interdomain Routing:

A Two-Step Approach

Apply COPE on an extended topology to derive good splitting ratios

Group dest prefixes with same set of exit points into a

virtual node

Derive

pseudo demands

destined to each virtual node by merging demands to prefixes that belong to this virtual node

Connect virtual node to corresponding peer using

virtual link

with infinite BW

Compute the extended topology G' = intradomain topology + peers + peering links + virtual nodes + virtual links
Apply COPE to compute routing on G' for the pseudo demands
Derive splitting ratios based on the routes
Apply COPE on point-to-point demands to compute intradomain routing
  Use the splitting ratios obtained in Step 1 to map point-to-multipoint demands into point-to-point demands
[Figure: extended topology consisting of the intradomain topology, peers, peering links, and virtual nodes/links]
Slide 98

11/16/2010

98

Preliminary Evaluation

COPE can significantly limit the impact of peering link failuresSlide99

11/16/2010

99

Outline

Challenges to TE in dynamic environment

COPE

Interdomain Reliability Service

Use failures as target application

ImplementationSlide100

11/16/2010

100

To Improve Reliability, We Need

Network redundancy

Over-provision of bandwidth

Diversity of physical connectivity

Challenge:

significant investments

Extra equipment for over-provisioning

Expense & difficulty to obtain rights of way for diversity

Our solution: REIN

Also, good traffic engineering algorithms for reliability

Challenge:

scalable, efficient and fast response to topology changes [Resilient Routing Reconf.]Slide101

11/16/2010

101

REIN Business Model: Three Possibilities

Peering

Mutual backup w/o financial settlement

Incentive: improve reliability of both at low cost

Symmetry in backup paths provisioning & usage

Cost-free

One-sided, volunteer and/or public service

Customer-Provider

Fixed or usage-based pricing

Pricing should limit abuseSlide102

11/16/2010

102

Interdomain Bypass Path Signaling

Many possibilities, e.g.,

Manual configuration

A new protocol

Utilize BGP communitiesSlide103

11/16/2010

103

REIN Data Forwarding

Main capability needed: Allow traffic to leave and re-enter a network

not supported under hierarchical routing of the current Internet

REIN forwarding mechanism

Interdomain GMPLS

IP tunneling

Either way, only need agreement b/w neighboring networks

Incrementally deployableSlide104

11/16/2010

104

BGP Bypass Path Signaling
B provides interdomain bypass paths to A.
Task of A: discover a path to a1 through B
BGP announcement: Dest. / AS path / Bypass path / Tag
  Additional attrs.: desired starting point (e.g., a2), bw, etc.
[Figure: Network A (a1, a2, a3) and Network B (b1, b2, b3).
  A announces "a1 / A / a1 / REIN_PATH_REQ", which reaches b1's RIB.
  REIN local policy at B computes bypass paths to export, e.g., lightly-loaded paths.
  B announces "a1 / BA / b2,b1,a1 / REIN_PATH_AVAIL"; a2's RIB stores "a1 / BA / b2,b1,a1 / -".]
Slide 105

11/16/2010

105

Further Optimization

After an IP network imports a set of such paths, how does it effectively utilize them in

routing computation

?

How to

minimize

the number of such paths?Slide106

11/16/2010

106

Further Optimization: Minimize Interdomain Bypass Paths

Motivation

REIN may provide many alternatives

Only a few may be necessary

Reduce configuration overhead & budget constraints

Step 1: Connectivity objective

Preset connectivity requirement

Cost assoc. w/ interdomain paths

Meet connectivity requirement + minimizing total cost

Formulated as a Mixed Integer Programming (MIP)

Step 2: TE objective

Sort interdomain paths according to a scoring function
Greedy selection until TE has the desired performance
Slide 107

11/16/2010

107

Traffic Engineering for Reliability (TE-R)

Objectives

Efficient utilization of all redundant resources

Scalable & implementable in the current Internet

Protection: fast ReRouting for common failure scenarios

Restoration: routing convergence for non-common failure scenarios

VPN QoS guarantee, if possible

[Figure: network topology for TE-R with intradomain links and REIN virtual links among a1, a2, a3]
Slide 108

11/16/2010

108

Our TE-R Algorithm: Features

Robust normal-case routing f*
  Based on COPE [ Wang et al. '06 ]
  Guarantee bandwidth provisioning for hose-model VPNs under f*
Robust fast rerouting under failures on top of f*
  VPN traffic purely intradomain if possible
Novel coverage-based techniques for computational feasibility and implementability
  Use flow-based routing to compute the optimal solution
  Coverage to generate an implementation with a performance guarantee
For details, please see the dissertation.
Slide 109

11/16/2010

109

Why Need a TE-R

(

Abilene 1-link failure

)

CSPF overloads bottleneck link by ~300%

vs.

robust TE-R successfully reroutes all traffic

Abilene bottleneck link traffic intensity: 1-link failures, Tuesday August 31, 2004Slide110

11/16/2010

110

Existing Reliability Techniques

Network redundancy techniques

Link layer techniques, e.g. SONET rings

Pro: fast response; Con: expensive

[ Giroire et al. ’03 ]

IP Restoration

Online: MATE

[ Elwalid et al. ’01 ]

& TeXCP

[ Kandula et al. ’05 ]

Offline: Optimization of IGP weights [ Fortz & Thorup ’03, Nucci et al. ’03 ]

Pro: inexpensive; Con: slow response
MPLS Protection [RFC 3469]
  Path protection: end-to-end
  Link protection: fast rerouting (FRR)
  Supported by modern routers; fast response & affordable costs
All techniques depend on available network redundancy
TE for reliability
  Restorable bandwidth-guaranteed connections [ Kar et al. '03, Kodialam et al. '04, Kodialam & Lakshman '03 ]
  Oblivious fast rerouting [ Applegate et al. '04 ]
Overlay for reliability
  RON [ Anderson et al. '01 ], application-level source routing [ Gummadi et al. '04 ]
  Pro: do not require cooperation from the backbone
  Con: slower response time (depends on transport-layer timeouts), less visibility into the network
Slide 111

11/16/2010

111

REIN Path Advertisement MessageSlide112

11/16/2010

112

Outline

Challenges to TE in dynamic environment

TE with dynamic traffic

TE with dynamic topology

REIN

R3

Implementation of proposed TE algorithmsSlide113

11/16/2010

113

R3 Motivation

Network topologies are constantly changing

Unexpected failures

Scheduled maintenance

Multiple changes may overlap in time

Changes happen frequently

Some changes take a long time to recover

Fiber cuts

Maintenance

An IP network usually operates with

multiple (simultaneous) topology changesSlide114

11/16/2010

114

Existing TE Approaches

IP fast re-routing

IPFRR

Multi-topology OSPF

Failure-carrying Packets (FCP)

Pros:

Fast response

Scalable: e.g. FCP works as long as a network remain connected

Cons:

Only guarantee connectivity

No guarantee of quality of response, esp. congestionSlide115

11/16/2010

115

Existing TE Approaches (cont'd)

MPLS fast re-routing

Optimal demand-oblivious restoration [Applegate et al, ’04]

COPE reliability routing [REIN TE-R]

Pros:

Fast response

Efficiency: provide some guarantee for quality of response

Cons:

Re-optimize routing for each change scenario

Not scalable
  On-demand computation: slow response time
  Pre-computation: need to keep many protection routings in routers
Disruption to even existing traffic not affected by changes
Slide 116

11/16/2010

116

Our Approach: R3

Objectives

Fast response + little disruption to existing traffic

Guarantee for quality of response

Scalability (not too many protection routings kept in routers)

Our approach

R3 = Resilient Routing Reconfiguration

A single protection routing reconfigured according to current topology (initial + changes)

Pre-computed protection + simple reconfiguration

Performance bound on congestion

A single protection routing works for many change scenariosSlide117

11/16/2010

117

R3 Basic Idea

Consider link (i, j)
If (i, j) fails, at most 10 Mbps of traffic needs re-routing
If the network can route 10 Mbps of additional, fictional traffic from i to j, failure recovery is guaranteed
  At least as much traffic can be carried w/o using (i, j)
  For proof, see the dissertation
[Figure: link (i, j) with capacity 10 Mbps carrying a 6 Mbps demand, plus additional fictional traffic of 4 Mbps and 6 Mbps from i to j]
Slide 118

11/16/2010

118

R3: Topology-uncertainty Demand

Topology uncertainty => traffic uncertainty

Link capacities as upper bound of traffic to re-route

Other choices:

a given percent of link capacities

Exponential number of topology changes => convex set of topology-uncertainty demands

Integer relaxation

Change s1 ~ topology-uncertainty demand x1
Change s2 ~ topology-uncertainty demand x2
Change s1 or change s2 ~ x1 or x2 → α·x1 + (1 − α)·x2,  α ∈ [0, 1]
Slide 119

11/16/2010

119

R3: Routing Reconfiguration

Link e = (i, j)

Let g_e be the routing of the topology-uncertainty demand from i to j
The protection routing when e fails is a re-normalization of g_e after removing e
  g_e(e) = 0 (no traffic goes on e)
  g_e(e') = g_e(e') / [1 − g_e(e)] (scale up to a unit flow)
Multiple link failures: reconfigure one-by-one
  Final result is independent of the order of reconfiguration
Slide 120

11/16/2010

120

Evaluation Methodology

Dataset

US-ISP

hourly PoP-level TMs for a tier-1 ISP (1 month in 2007)

RocketFuel

PoP-level gravity model TMs

Topology changes

All single- and two-link changes

Sampled three- and four-link changes

For US-ISP, single-link + single-maintenance

TE algorithms

Base routing: R3, OSPF
Protection routing: R3, OSPF reconvergence, OSPF link detour, FCP
Flow-based optimal routing (optimal) used as reference
Slide 121

11/16/2010

121

US-ISP: Worst-case with one change

R3 and OSPF+R3 achieves close-to-optimal performance

All others lead to much higher level of traffic intensitySlide122

11/16/2010

122

US-ISP: One Change Performance Ratio

R3 and OSPF+R3 consistently performs within 30% of optimal

All others lead to much higher performance penalty (>= 260%)Slide123

11/16/2010

123

US-ISP: Two and Three Change

R3 and OSPF+R3 significantly out-performs other algorithmsSlide124

11/16/2010

124

RocketFuel SBC: Two and Three Change

R3 significantly out-performs other algorithms, even OSPF+R3

For SBC, integrated R3 base + protection is necessary
Slide 125

11/16/2010

125

RocketFuel Level-3: Two and Three Change

R3 and OSPF+R3 have similar performance, and significantly out-performs other algorithms

For Level-3, a good OSPF + R3 is enoughSlide126

11/16/2010

126

R3 Summary

R3

A single protection routing that can be reconfigured to recover from multiple topology changes

Simple reconfiguration upon changes

Performance guarantee

A single protection routing works for many possible changes

Ongoing & future work

Implementation of routing reconfiguration on routers

Preventing transient loops during simultaneous reconfigurationsSlide127

11/16/2010

127

Generate MPLS configurations

Configure LSPs for each O-D pair

Create explicit LSP path

Requested bandwidth of the LSP is proportional to the (path-based) traffic split ratio

Configure backup LSPs/tunnels for each protected link (to implement R3)

Create explicit backup LSP/tunnel

Requested bandwidth of the backup LSP/tunnel is proportional to the backup traffic split ratio

11/16/2010

127Slide128

11/16/2010

128

Future Directions on COPE

COPE with OSPF

COPE with online TE

COPE for other network optimization problemsSlide129

11/16/2010

129

Future Directions on REIN

A thorough study of the effects of cross-provider shared-risk link group data

Effectiveness of REIN on smaller IP networks

Improving TE robustness under dynamic topologySlide130

11/16/2010

130

Our Approaches

REIN

Cost-effective way to increase redundancy

R3

Scalable, congestion-free fast reroutingSlide131

11/16/2010

131

Summary of Contributions

Traffic Variations

Topology Changes

ISP Objectives

Network infrastructure

Traditional TE approaches

Network infrastructure

Interdomain

Reliability

Service

Common-case Optimization with Penalty Envelope

Resilient Routing Service

Resilient Routing Reconfiguration

Traffic VariationsSlide132

11/16/2010

132

What's next

Traffic Variations

Topology Changes

ISP Objectives

Network infrastructure

Interdomain

Reliability

Service

Common-case Optimization with Penalty Envelope

Resilient Routing Service

Resilient Routing Reconfiguration

Traffic VariationsSlide133

11/16/2010

133

What's next

Traffic Variations

Topology Changes

ISP Objectives

Network infrastructure

Interdomain

Reliability

Service

Common-case Optimization with Penalty Envelope

Resilient Routing Service

Resilient Routing Reconfiguration

Traffic VariationsSlide134

11/16/2010

134

Summary

Traffic Variations

Topology Changes

ISP Objectives

Network infrastructure

Interdomain

Reliability

Service

Common-case Optimization with Penalty Envelope

Resilient Routing Service

Resilient Routing Reconfiguration

Traffic VariationsSlide135

11/16/2010

135

Why REIN: Overload Prevention (US-ISP failure log)

REIN can reduce normalized traffic intensity by 118% ~ 35%, depending on the TE-R algorithms used.

Improvement of traffic intensity by REIN for a week in January 2007 for US-ISPSlide136

136

R3 Backup

Slides

136Slide137

137

US-ISP Single Failure: Zoom In

R3 achieves near-optimal performance (R3 vs opt)

For each hour (in one day), compare the worst case failure scenario for each algorithm

137Slide138

Level-3 Multiple Failures

Left: all two-failure scenarios; Right: sampled three-failure scenarios

138

R3 out-performs other schemes by >50%Slide139

SBC Multiple Failures

Left: all two-failure scenarios; Right: sampled three-failure scenarios

“Ideal” R3 outperforms OSPF+R3 in some casesSlide140

Robustness on Base Routing

OSPFInvCap: link weight inversely proportional to bandwidth
OSPF: optimized link weights
Left: single failure; Right: two failures

A better base routing can lead to better routing protection Slide141

141

Flow RTT (Denver-Los Angeles)

R3 implementation achieves smooth routing protection

141Slide142

142

R3: OD pair throughput

Traffic demand is carried by R3

under multiple link failure scenarios

142Slide143

143

R3: Link Utilization

Bottleneck link load is controlled under 0.37 using R3

143Slide144

Offline Precomputation Solution

[C2] contains an infinite number of constraints due to x
Consider the maximum "extra" load on link e caused by x:
  Σ_{l∈E} x_l p_l(e) ≤ UB for all x ∈ X_F iff there exist multipliers π_e(l) and λ_e satisfying the corresponding dual constraints (LP duality)
Convert [C2] to a polynomial number of constraints using π_e(l), λ_e and the constraints on π_e(l) and λ_e
Slide 145

145

Traffic Priority

Service priority is a practical requirement of routing protection

Traffic Priority Example

TPRT (real-time IP) traffic

should be congestion-free under up-to-3 link failures (to achieve 99.999% reliability SLA)

Private Transport (TPP) traffic

should survive up-to-2 link failures

General IP traffic

should only survive single link failures

145Slide146

146

R3 with Traffic Priority

Attribute

protection level

to each class of traffic

TPRT protection level 3

TPP protection level 2

IP protection level 1

Traffic with protection level greater than equal to i should survive under failure scenarios covered by protection level i

146Slide147

147

R3 with Traffic Priority Algorithm

Precomputation with Traffic Priority

Consider protection for each protection level i

Guarantee each class of traffic has no congestion under the failure scenarios covered by its protection level

147Slide148

148

R3 with Traffic Priority Simulation

Routing Protection Simulation

Basic R3

vs

R3 with Traffic Priority

Methodology

US-ISP: a large tier-1 operational network topology

hourly PoP-level TMs for a tier-1 ISP (1 week in 2007)

extract IPFR and PNT traffic from traffic traces, subtract IPFR and PNT from total traffic and treat the remaining as general IP

Protection levels:

TPRT: up-to-4 link failures

TPP: up-to-2 link failures

IP: single link failures

Failure scenarios:

enumerated all single link failures

100 worst cases of 2 link failures

100 worst cases of 4 link failures

148Slide149

Traffic Protection Priority

IP: up-to-1 failure protection; TPP: up-to-2 failure protection; TPRT: up-to-4 failure protection
Left: single failure; Right top: worst two failures; Right bottom: worst four failures

149

R3 respects different traffic protection prioritiesSlide150

Linux Implementation

MPLS-ff software router
  Based on Linux kernel 2.6.25 + Linux MPLS
  Implement flow-based routing for efficient R3 online reconfiguration
  Extend the MPLS FWD (forward) data structure to enable per-hop traffic splitting for each flow
Failure detection and response
  Detection using Ethernet monitoring in the kernel
  In operational networks, detection can be conservative (for SRLG, MLG)
  Notification using ICMP 42 flooding
    Requires reachability under failures
  Rerouting traffic by MPLS-ff label stacking
  Online reconfiguration of traffic splitting ratios (locally at each router)
Slide 151

151

MPLS-ff
R3 uses flow-based traffic splitting (for each OD pair)
  e.g., p_newy→wash(chic→hous) = 0.2 means link chic→hous will carry 20% of the traffic originally carried by newy→wash when newy→wash fails
Current MPLS routers only support path-based traffic splitting
  Traffic load on equal-cost LSPs is proportional to the bandwidth requested by each LSP
  Juniper J-, M-, T-series and Cisco 7200, 7500, 12000 series
  e.g., multiple paths from newy to wash can be set up to protect link newy→wash
Slide 152

152

MPLS-ff

Convert flow-based to path-based routing

e.g., using flow decomposition algorithm

-- [Zheng et al]

Expensive to implement R3 online reconfiguration

Need to recompute and signal LSPs after each failure

Extend MPLS to support flow-based routing

MPLS-ff

Enable next-hop traffic splitting ratios for each flow

flow: traffic originally carried by a protected link

152

152Slide153

153

MPLS-ff

MPLS FWD data structure

Extended to support

multiple

Next Hop Label Forwarding Entry (NHLFEs)

One NHLFE specifying one neighbor

One Next-hop Splitting Ratio for each NHLFE

Label 200 FWD:

NHLFE R2: 50%

NHLFE R3: 50%

NHLFE R4: 0%

R1

R2

R4

R3

153Slide154

154

MPLS-ff
Implement Next-hop Splitting Ratio
  router i, next-hop j, protected link (a, b):
Packet hashing
  Same hash value for packets in the same TCP flow
  Independent hash values on different routers for any particular TCP flow
  Hash of (packet flow fields + router id)
  Pitfall if Hash = f(src, dst, srcPort, dstPort) only:
    i forwards to j only packets with 40 < Hash < 64, so all packets arriving at j from i have 40 < Hash < 64
    (skewed hash value distribution)
Slide 155

155

Failure Response

Failure Detection and Notification

detection: Ethernet monitoring in Linux kernel

notification: ICMP 42 flood

Failure Response

MPLS-ff label stacking

Protection Routing Update

R3 online reconfiguration

requires a local copy of

p

on each router

155Slide156

R3 Design: Routing Model

Network topology as graph G = (V, E)
  V: set of routers
  E: set of directed network links; link e = (i, j) has capacity c_e
Traffic matrix (TM)
  TM d is a set of demands: d = { d_ab | a, b ∈ V }
  d_ab: traffic demand from a to b
Flow-based routing representation
  r = { r_ab(e) | a, b ∈ V, e ∈ E }
  r_ab(e): the fraction of traffic from a to b (d_ab) that is carried by link e
  e.g., r_ab(e) = 0.25 means link e carries 25% of the traffic from a to b
Slide 157

Intuition behind R3

Plan rerouting on the single original topology
  Avoid enumeration of topologies (failure scenarios)
  Compute (r, p) to guarantee congestion-freedom for d + x ∈ X_F on G
Puzzles
  Add rerouted traffic before it appears
    X_F = { x | 0 ≤ x_l ≤ c_l, Σ_l (x_l / c_l) ≤ F } is a superset of the rerouted traffic under failures
    Not real traffic demand!
  Use network resources that will disappear
    Protection routing p uses links that will fail
    Topology with links that will fail!
By doing two counter-intuitive things, R3 achieves:
  Congestion-freedom under multiple link failures w/o enumeration of failure scenarios
  Optimality for single link failures
  Fast rerouting
Slide 158

Online Reconfiguration
Step 1: Fast rerouting after ac fails
  Precomputed p for link ac: p_ac(ac) = 0.4, p_ac(ab) = 0.3, p_ac(ad) = 0.3
  ac fails → the 0.4 needs to be carried by ab and ad
    p_ac: ac → 0, ab → 0.5, ad → 0.5
  a locally rescales p_ac and activates fast rerouting
  Efficiently implemented by extending MPLS label stacking
[Figure: four-node example showing link loads (load/capacity) before and after fast rerouting, the precomputed p_ac(e) (0.3, 0.3, 0.3, 0.6, 0.4), and the rescaled values (0, 0.5, 0.5, 0.5, 1)]