Slide1
Weighted Random Oblivious Routing on Torus Networks
Rohit Sunkam Ramanujam
Bill Lin
Electrical and Computer Engineering
University of California, San Diego
Slide2
Networks-On-Chip
Chip-multiprocessors (CMPs) increasingly popular
Torus, Mesh, Flattened Butterfly – candidate architectures for on-chip networks
Intel Larrabee
Tilera Tile64
Slide3
Networks-On-Chip
Chip-multiprocessors (CMPs) increasingly popular
Torus, Mesh, Flattened Butterfly – candidate architectures for on-chip networks
Folded Torus
2D Torus
Slide4
Routing Algorithm Wishlist
Criterion                              Ideal
Optimum worst-case throughput          ✔
Low latency                            ✔
Good average-case throughput           ✔
Easy to guarantee deadlock freedom     ✔
Low implementation complexity          ✔
Closed-form algorithmic description    ✔
Slide5
Outline
Motivation
Related Work
Optimal routing for rings
Optimal routing for 2D torus
Slide6
Optimal Oblivious Routing
Cast as a multi-commodity flow problem
Maximize worst-case throughput
Minimize hop-count
Solve using Linear Programming
Impractical for large networks
Number of paths too large (exponential)
Hard to make it deadlock-free
LP not scalable
Slide7
Optimal Oblivious Routing
Criterion                              Ideal  Optimal Oblivious
Optimum worst-case throughput          ✔      ✔
Low latency                            ✔      ✔
Good average-case throughput           ✔      ✔
Easy to guarantee deadlock freedom     ✔      X
Low implementation complexity          ✔      X
Closed-form algorithmic description    ✔      X
Slide8
Optimal 2TURN
Optimum oblivious routing with only 2TURN paths.
[Figure: 4×4 torus with nodes labeled 0,0 through 3,3]
Slide9
Optimal 2TURN
Optimum oblivious routing with only 2TURN paths.
[Figure: two 4×4 torus diagrams with nodes labeled 0,0 through 3,3]
Slide10
Optimal 2TURN
Criterion                              Ideal  Optimal Oblivious  Optimal 2TURN
Optimum worst-case throughput          ✔      ✔                  ✔
Low latency                            ✔      ✔                  ✔
Good average-case throughput           ✔      ✔                  ✔
Easy to guarantee deadlock freedom     ✔      X                  ✔
Low implementation complexity          ✔      X                  X
Closed-form algorithmic description    ✔      X                  X
Slide11
Valiant Load Balancing (VAL)
2 phases of X-Y routing
[Figure: 4×4 torus with nodes labeled 0,0 through 3,3]
Slide12
Improved Valiant Routing (IVAL)
Phase 1: X-Y, Phase 2: Y-X
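The two-phase schemes described so far (VAL with two X-Y phases, IVAL with an X-Y phase followed by a Y-X phase) can be sketched as below. This is only an illustration, not the authors' implementation: the helper names `ring_steps`, `dor_path` and `two_phase_path`, the (x, y) coordinate convention, and breaking distance ties toward the positive direction are all assumptions made for the example.

```python
import random

def ring_steps(src, dst, k):
    """Minimal sequence of +1/-1 steps from src to dst on a ring of radix k.
    Assumption: a tie (distance k/2) is broken toward the + direction."""
    fwd = (dst - src) % k
    return [+1] * fwd if fwd <= k - fwd else [-1] * (k - fwd)

def dor_path(src, dst, k, order):
    """Dimension-order path on a k x k torus; order is "XY" or "YX"."""
    x, y = src
    path = [src]
    for dim in order:
        if dim == "X":
            for s in ring_steps(x, dst[0], k):
                x = (x + s) % k
                path.append((x, y))
        else:
            for s in ring_steps(y, dst[1], k):
                y = (y + s) % k
                path.append((x, y))
    return path

def two_phase_path(src, dst, k, second_phase="XY", rng=random):
    """second_phase="XY" gives a VAL-style path, "YX" an IVAL-style path."""
    mid = (rng.randrange(k), rng.randrange(k))    # uniformly random intermediate node
    first = dor_path(src, mid, k, "XY")           # phase 1 is X-Y in both schemes
    second = dor_path(mid, dst, k, second_phase)  # phase 2: X-Y (VAL) or Y-X (IVAL)
    return first + second[1:]                     # drop the duplicated intermediate node
```

Calling `two_phase_path((0, 0), (2, 3), 4, "YX")` returns one randomly chosen IVAL-style path on the 4×4 torus as a list of (x, y) nodes.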
[Figure: 4×4 torus with nodes labeled 0,0 through 3,3]
Slide13
Improved Valiant Routing (IVAL)
Phase 1: X-Y, Phase 2: Y-X
[Figure: 4×4 torus with nodes labeled 0,0 through 3,3]
Slide14
VAL and IVAL
Criterion                              Ideal  Optimal Oblivious  Optimal 2TURN  VAL  IVAL
Optimum worst-case throughput          ✔      ✔                  ✔              ✔    ✔
Low latency                            ✔      ✔                  ✔              X    X
Good average-case throughput           ✔      ✔                  ✔              X    ✔
Deadlock freedom                       ✔      X                  ✔              ✔    ✔
Low implementation complexity          ✔      X                  X              ✔    ✔
Closed-form description                ✔      X                  X              ✔    ✔
Slide15
Latency Comparison
13.5%
Slide16
Evolution of W2TURN
Step 1. Started with the simple case of 1D rings
Developed Weighted Random Direction (WRD)
Step 2. Described 2TURN paths in IVAL in terms of routing on 1D segments (I2TURN)
I2TURN has an analytical expression for hop count.
Step 3. Combined the intuition gained from WRD, I2TURN and optimal 2TURN
Developed Weighted Random 2TURN routing (W2TURN)
Analytically showed that the latency of W2TURN is strictly better than that of I2TURN
Slide17
Outline
Motivation
Related Work
Optimal routing for rings
Optimal routing for 2D torus
Slide18
Routing on Rings
Randomized Load Balancing (RLB) – Optimal worst-case throughput for rings
Same routing strategy for both odd and even radix networks
Slide19
Some Facts …
Worst-case throughput is determined by the maximum channel load under the most adversarial traffic
For a torus network with radix k, the maximum channel load for worst-case throughput optimality is:
= k/4 for even k
= k/4 – 1/(4k) for odd k
Slide20
Rings – The Difference Between Odd and Even
RLB: Route minimally with probability (k-∆)/k
Why can’t we route minimally more often?
Total channel load = (k-1)/2 * (k+1)/(2k) = k/4 – 1/(4k) = maximum load for worst-case throughput optimality
Tornado traffic: ∆ = (k-1)/2
Slide21
Rings – The Difference Between Odd and Even
RLB: Route minimally with probability (k-∆)/k.
Can we route minimally more often?
Total channel load = (k/2 – 1) * (k+2)/(2k) = k/4 – 1/k < maximum load for worst-case throughput optimality
Tornado traffic: ∆ = k/2 – 1
Route minimally with a probability of (k-∆-1)/(k-2) > (k-∆)/k
Slide22
WRD Algorithm
Odd radix:
Route minimally with probability (k-∆)/k
Route non-minimally with probability ∆/k
Even radix:
Route minimally with probability (k-∆-1)/(k-2) when k > 2 and ∆ > 0
Route non-minimally with probability (∆-1)/(k-2) when k > 2 and ∆ > 0
Slide23
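The WRD rule from the preceding slide can be written as a small function. The sketch below is illustrative only: the names `wrd_minimal_probability` and `wrd_direction` are hypothetical, returning probability 1 for ∆ = 0 is an assumption (the slide does not cover that case), and even radix k ≤ 2 is rejected because the slide defines the even-radix rule only for k > 2.

```python
from fractions import Fraction
import random

def wrd_minimal_probability(k, delta):
    """Probability of routing in the minimal direction on a ring of radix k,
    where delta is the minimal hop distance to the destination."""
    if delta == 0:
        return Fraction(1)  # assumption: nothing to route, not covered by the slide
    if k % 2 == 1:
        return Fraction(k - delta, k)          # odd radix: (k-∆)/k
    if k <= 2:
        raise ValueError("even-radix rule is stated only for k > 2")
    return Fraction(k - delta - 1, k - 2)      # even radix: (k-∆-1)/(k-2)

def wrd_direction(k, delta, rng=random):
    """Sample "minimal" or "non-minimal" with the WRD weights."""
    p = wrd_minimal_probability(k, delta)
    return "minimal" if rng.random() < p else "non-minimal"

# Tornado traffic on an even-radix ring (∆ = k/2 - 1): the load on a
# minimal-direction channel, computed as ∆ times the minimal probability
# (the same product used on the slides), comes out to exactly k/4.
for k in (4, 6, 8, 10):
    delta = k // 2 - 1
    assert delta * wrd_minimal_probability(k, delta) == Fraction(k, 4)
```

For odd radix the function reduces to the RLB probability, so the same product gives k/4 – 1/(4k), matching the odd-radix bound.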
Latency Evaluation
25%
Slide24
WRD = Optimal
Slide25
WRD - Ideal for 1D Rings
Criterion                              Ideal  WRD
Optimum worst-case throughput          ✔      ✔
Low latency                            ✔      ✔
Good average-case throughput           ✔      ✔
Easy to guarantee deadlock freedom     ✔      ✔
Low implementation complexity          ✔      ✔
Closed-form algorithmic description    ✔      ✔
Slide26
Outline
Motivation
Related Work
Optimal routing for rings
Optimal routing for 2D torus
Slide27
I2TURN
Describe 2TURN paths in terms of 1D segments.
2TURN paths: X-Y-X or Y-X-Y
[Figure: 4×4 torus with nodes labeled 0,0 through 3,3]
X-Y-X routing
Select intermediate X position x* uniformly at random
Route minimally to x*
Route using RLB on the Y ring at X = x*
Slide28
I2TURN
Describe 2TURN paths in terms of 1D segments.
2TURN paths: X-Y-X or Y-X-Y
[Figure: 4×4 torus with nodes labeled 0,0 through 3,3]
X-Y-X routing
Select intermediate X position x* uniformly at random
Route minimally to x*
Route using RLB on the Y ring at X = x*
1/4
Slide29
I2TURN
Describe 2TURN paths in terms of 1D segments.
2TURN paths: X-Y-X or Y-X-Y
[Figure: 4×4 torus with nodes labeled 0,0 through 3,3]
X-Y-X routing
Select intermediate X position x* uniformly at random
Route minimally to x*
Route using RLB on the Y ring at X = x*
Route minimally to the destination
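The four X-Y-X steps listed above can be sketched as a path generator. This is a rough illustration under stated assumptions, not the authors' code: `ring_walk` and `i2turn_xyx_path` are hypothetical names, nodes are (x, y) tuples, and a distance tie on a ring is broken toward the positive direction. The RLB weight used for the Y ring is the one given earlier in the talk: minimal with probability (k-∆)/k.

```python
import random

def ring_walk(start, dst, k, minimal):
    """Ring positions visited from start to dst (start excluded).
    minimal=True goes the shorter way round, False the longer way.
    Assumption: on a tie, the positive direction counts as minimal."""
    fwd = (dst - start) % k
    if fwd == 0:
        return []
    go_positive = fwd <= k - fwd
    if not minimal:
        go_positive = not go_positive
    step, hops = (1, fwd) if go_positive else (-1, k - fwd)
    return [(start + step * i) % k for i in range(1, hops + 1)]

def i2turn_xyx_path(src, dst, k, rng=random):
    (sx, sy), (dx, dy) = src, dst
    x_star = rng.randrange(k)                 # intermediate X position, uniform
    path = [src]
    # Route minimally along X to x*.
    path += [(x, sy) for x in ring_walk(sx, x_star, k, minimal=True)]
    # RLB on the Y ring at X = x*: minimal with probability (k-∆)/k.
    fwd = (dy - sy) % k
    delta = min(fwd, k - fwd)
    take_minimal = rng.random() < (k - delta) / k
    path += [(x_star, y) for y in ring_walk(sy, dy, k, minimal=take_minimal)]
    # Route minimally along X to the destination.
    path += [(x, dy) for x in ring_walk(x_star, dx, k, minimal=True)]
    return path
```

With k = 4 and a Y distance of ∆ = 1, the Y segment is minimal with probability 3/4 and non-minimal with probability 1/4, which agrees with the labels on the slide's figure.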
3/4
1/4
Slide30
I2TURN – Main Idea
For XYX routing, load balance across the Y-rings to make traffic along every Y-ring admissible
Use worst-case throughput optimal routing (RLB) on the Y-ring
Can easily derive an analytical expression for average packet latency
Can be proved to be equivalent to IVAL. Hence, it is worst-case throughput optimal
Can define YXY routing by swapping dimensions
Slide31
W2TURN – Even Radix
Reduces latency over I2TURN
Use WRD instead of RLB
Interpolate X-Y-X and Y-X-Y 2TURN routing with minimal X-Y and Y-X routing
XYX: k/(2(k+1))
YXY: k/(2(k+1))
XY: 1/(2(k+1))
YX: 1/(2(k+1))
Slide32
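The mixing weights from the preceding slide can be checked and sampled with a few lines of Python. The sketch reads the slide's "k/2(k+1)" as k/(2(k+1)) and "1/2(k+1)" as 1/(2(k+1)), which is the reading under which the four weights sum to 1; the function names `w2turn_even_mode_weights` and `pick_mode` are hypothetical.

```python
from fractions import Fraction
import random

def w2turn_even_mode_weights(k):
    """Weights of the four routing modes for even radix k."""
    if k % 2 != 0:
        raise ValueError("these weights are stated for even radix only")
    two_turn = Fraction(k, 2 * (k + 1))   # X-Y-X and Y-X-Y 2TURN routing
    minimal = Fraction(1, 2 * (k + 1))    # minimal X-Y and Y-X routing
    return {"XYX": two_turn, "YXY": two_turn, "XY": minimal, "YX": minimal}

def pick_mode(k, rng=random):
    """Sample one routing mode according to the weights."""
    weights = w2turn_even_mode_weights(k)
    modes = list(weights)
    return rng.choices(modes, weights=[float(w) for w in weights.values()])[0]

# The weights form a probability distribution for every even radix.
for k in (4, 6, 8, 16):
    assert sum(w2turn_even_mode_weights(k).values()) == 1
```

For k = 4 this gives 2/5 each for XYX and YXY and 1/10 each for XY and YX, so most packets still take 2TURN paths.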
X-Y-X W2TURN
[Figure: 4×4 torus with nodes labeled 0,0 through 3,3]
X-Y-X routing
Select intermediate X position x* uniformly at random
Route minimally to x*
Route using WRD on the Y ring at X = x*
Slide33
X-Y-X W2TURN
[Figure: 4×4 torus with nodes labeled 0,0 through 3,3]
X-Y-X routing
Select intermediate X position x* uniformly at random
Route minimally to x*
Route using WRD on the Y ring at X = x*
1
Slide34
X-Y-X W2TURN
[Figure: 4×4 torus with nodes labeled 0,0 through 3,3]
X-Y-X routing
Select intermediate X position x* uniformly at random
Route minimally to x*
Route using WRD on the Y ring at X = x*
Route minimally to the destination
1
When the number of hops in both directions is equal, avoid using links used by minimal X-Y or Y-X routing.
Slide35
W2TURN – Odd Radix
W2TURN = Optimal 2TURN for odd radix
More elaborate description but easy to implement
Uses X-Y-X and Y-X-Y 2TURN routing with equal probability
Most of the intuition gained by observing optimal 2TURN paths
Slide36
Latency Evaluation
13.5%
Slide37
W2TURN ≈ Optimal-2TURN
W2TURN = Optimal-2TURN for odd radix
W2TURN within 0.72% of Optimal-2TURN for even radix
Slide38
Back to our Wishlist …
Criterion                              Ideal  Optimal Oblivious  Optimal 2TURN  VAL  IVAL  W2TURN
Optimum worst-case throughput          ✔      ✔                  ✔              ✔    ✔     ✔
Low latency                            ✔      ✔                  ✔              X    X     ✔
Good average-case throughput           ✔      ✔                  ✔              X    ✔     ✔
Easy to guarantee deadlock freedom     ✔      X                  ✔              ✔    ✔     ✔
Low implementation complexity          ✔      X                  X              ✔    ✔     ✔
Closed-form algorithmic description    ✔      X                  X              ✔    ✔     ✔
Slide39
Summary of Contributions
WRD: Optimal routing algorithm for rings
Worst-case throughput optimal
Minimum hop count
W2TURN-Odd: Optimal 2TURN routing with a closed-form description for 2D torus with odd radix
W2TURN-Even: Latency within 0.72% of optimal 2TURN routing for 2D torus with even radix
WRD and W2TURN are the best-performing closed-form algorithms for 1D and 2D torus!!
Slide40
Thank You !!
Slide41
Average case throughput
Slide42
Proof of worst-case throughput optimality
Optimal worst-case channel load = 2*(Channel load for uniform traffic)
To prove that a routing algorithm is worst-case throughput optimal, it is sufficient to prove that its maximum channel load:
= k/4 when k is even.
= k/4 – 1/(4k) when k is odd.
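As a quick sanity check of the relation above on a single ring, the sketch below computes the channel load under uniform traffic with minimal routing by direct counting, doubles it, and compares the result with the two closed forms. It rests on assumptions not stated on the slide: each node sends an equal 1/k share to every node including itself, and a destination at distance exactly k/2 is split evenly between the two directions. The function names are hypothetical.

```python
from fractions import Fraction

def uniform_ring_channel_load(k):
    """Load on one clockwise channel of a radix-k ring under uniform traffic
    with minimal routing. By symmetry, the clockwise hop-traffic injected per
    source equals the load per clockwise channel (k sources, k channels)."""
    load = Fraction(0)
    for d in range(1, k):            # clockwise offset of the destination
        if d < k - d:
            share = Fraction(1)      # clockwise is the minimal direction
        elif d == k - d:
            share = Fraction(1, 2)   # tie: split evenly (assumption)
        else:
            share = Fraction(0)      # minimal path goes counter-clockwise
        load += Fraction(1, k) * share * d
    return load

def optimal_worst_case_load(k):
    """Closed forms from the slide: k/4 (even k), k/4 - 1/(4k) (odd k)."""
    if k % 2 == 0:
        return Fraction(k, 4)
    return Fraction(k, 4) - Fraction(1, 4 * k)

# Twice the uniform-traffic load matches the closed forms for these radices.
for k in range(3, 12):
    assert 2 * uniform_ring_channel_load(k) == optimal_worst_case_load(k)
```

This checks only the arithmetic relation for small radices; it is not a substitute for the proof the slide refers to.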