Slide1
Block-Based Packet Buffer with Deterministic Packet Departures
Hao Wang and Bill Lin
University of California, San Diego
HPSR 2010, Dallas
Slide2
HPSR, June 13-16, 2010
Packet Buffer in Routers
Input linecards have 40 bytes @ 40 Gb/s = 8 ns to read and write a packet.
Routers need to store packets to deal with congestion.
Bandwidth × RTT = 40 Gb/s × 250 ms = 10 Gb buffer.
Too big to store in SRAM, hence need to use DRAM.
Problem: DRAM access time is ~40 ns, roughly a 10x speed difference.
[Figure: router with a scheduler and packet buffers; ports labeled in and out]
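The numbers on this slide can be re-derived with a few lines of arithmetic. The sketch below uses only the values stated above; the variable names are illustrative, and the step that splits the 8 ns slot evenly between one write and one read is an assumption used here to arrive at the "roughly 10x" figure.

```python
# Values taken from the slide text.
LINE_RATE_BPS = 40e9      # 40 Gb/s line rate
PACKET_BYTES = 40         # packet size used in the example
RTT_S = 0.250             # 250 ms round-trip time
DRAM_ACCESS_NS = 40.0     # approximate DRAM access time

# Time slot: time to receive one 40-byte packet at line rate.
slot_ns = PACKET_BYTES * 8 / LINE_RATE_BPS * 1e9

# Assumption: each slot must fit one write and one read,
# so each memory access gets half a slot.
per_access_ns = slot_ns / 2

# Rule-of-thumb buffer size: bandwidth x RTT.
buffer_bits = LINE_RATE_BPS * RTT_S

# Gap between DRAM access time and the time available per access.
speed_gap = DRAM_ACCESS_NS / per_access_ns

print(f"slot = {slot_ns:.0f} ns, buffer = {buffer_bits / 1e9:.0f} Gb, gap = {speed_gap:.0f}x")
```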
Slide3
Parallel and Interleaved DRAM
Assume the DRAM-to-SRAM access latency ratio is 3.
[Figure: packets (P) interleaved across parallel DRAM banks]
Slide4
Problems with Parallelism
Access patterns may create problems.
To access 3, 6, 9 and 11 one after another, it is possible to issue interleaved read requests and read those packets out at line rate.
[Figure: packets 1 to 14 distributed across the DRAM banks]
Slide5
Problems with Parallelism
But accessing 2 & 3 or 10 & 11 in succession is problematic.
This is an example of a packet access conflict.
[Figure: packets 1 to 14 distributed across the DRAM banks]
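A packet access conflict can be illustrated with a small checker. This is a minimal sketch, not the authors' logic: the packet-to-bank layout `BANK_OF` is hypothetical (the figure's actual layout is not recoverable from the text) and was chosen only to agree with the two statements above, and `first_conflict` is an illustrative name. It assumes one read is issued per time slot and that a bank stays busy for 3 slots, matching the latency ratio assumed earlier.

```python
# Hypothetical packet-to-bank layout, chosen only to match the text:
# 3, 6, 9, 11 sit in different banks; 2 & 3 share a bank; 10 & 11 share a bank.
BANK_OF = {2: 0, 3: 0, 6: 1, 9: 2, 10: 3, 11: 3}

def first_conflict(sequence, bank_of, busy_slots=3):
    """Return the first (earlier, later) packet pair that hits a busy bank, else None.

    One read is issued per time slot; a bank stays busy for `busy_slots` slots.
    """
    last_use = {}  # bank -> (slot of last issued read, packet read)
    for slot, pkt in enumerate(sequence):
        bank = bank_of[pkt]
        if bank in last_use and slot - last_use[bank][0] < busy_slots:
            return (last_use[bank][1], pkt)
        last_use[bank] = (slot, pkt)
    return None

print(first_conflict([3, 6, 9, 11], BANK_OF))  # None
print(first_conflict([2, 3], BANK_OF))         # (2, 3)
print(first_conflict([10, 11], BANK_OF))       # (10, 11)
```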
Slide6
Use Packet Departure Time
In wide classes of routers (crossbar routers), packet departures are determined by the scheduler on the fly.
Packet buffers that cater to these routers exist but are complex.
In other high-performance routers, such as Switch-Memory-Switch and Load-Balanced Routers, the packet departure time can be calculated when the packet is inserted into the buffer.
Slide7
Solution
We will use the known departure times of the packets to schedule them to different DRAM banks such that there won’t be any conflicts at arrival or departure.
Slide8
Packet Buffer Abstraction
Fixed-size packets; time is slotted (example: 40 Gb/s, 40-byte packet => 8 ns).
The buffer may contain an arbitrarily large number of logical queues, but with deterministic access.
Single-write, single-read, time-deterministic packet buffer model.
Slide9
Packet Buffer Architecture
Interleaved memory architecture with multiple slower DRAM banks.
K slower DRAM banks.
b time slots to complete a single memory read or write operation.
b consecutive time slots form a frame.
Each bank is segmented into several sections.
A memory block is a collection of sections.
Slide10
Proposed Architecture
[Figure: proposed architecture. Labels: arriving packets, reservation table, DRAMs D1, D2, ..., DK, memory block, bypass buffer, departure reorder buffers, departing packets; index ranges 1..K, 1..M, 1..b, and 1..N]
Slide11
Reservation Table
[Figure: reservation table with columns 1 ... K, rows for blocks 1, 2, ..., i, ..., and a numeric entry in each cell]
Use a counter of size log2 N bits to keep track of the actual number of packets in N packet locations.
Reduce the size of the reservation table by
Slide12
Packet Access Conflicts
Arrival conflicts
An arriving packet keeps a bank busy for b cycles.
Need b-1 additional banks.
Departure conflicts
It takes b cycles to read a packet to the output.
Need b additional banks.
Overflow conflicts
Incoming packets with departure times within N frames are stored in the same memory block.
That is up to N×b arrivals; however, each memory section stores at most N packets.
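The overflow argument can be made concrete with a short sketch. The function names `block_index` and `min_sections_needed` are illustrative, and reusing the M memory blocks in round-robin order (the modulo step) is an assumption; the slides only state that departures within the same N frames share a memory block.

```python
def block_index(departure_slot, n, b, m):
    """Memory block for a packet with the given departure time slot.

    All departures inside the same window of n frames (n * b slots) map to
    the same block. Assumption: the m blocks are reused round-robin.
    """
    return (departure_slot // (n * b)) % m

def min_sections_needed(n, b):
    """Sections one block needs so that a worst-case window cannot overflow."""
    worst_case_packets = n * b           # one departure per slot over n frames
    return -(-worst_case_packets // n)   # ceiling division, which equals b

# Example with n = 4 packets per section, b = 3 slots per frame, m = 8 blocks.
print(block_index(0, 4, 3, 8), block_index(11, 4, 3, 8), block_index(12, 4, 3, 8))
print(min_sections_needed(4, 3))
```

Under these assumptions a single section (capacity N) cannot absorb a full window of N×b packets, so the packets of one block have to be spread over at least b sections in different banks.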
Slide13
Water-filling Algorithm
A memory block is managed by a row of the reservation table.
[Figure: a memory block drawn as a row of memory sections, with labels busy, occupied, and most empty available bank]
Slide14
Packet Access Conflicts
Water-filling Algorithm
Pick the most empty bank to store the arriving packet.
Solves overflow conflicts.
Theorem: With at least 3b-1 DRAM banks, it is always possible to admit all the arriving packets and write them into memory blocks based on their departure times.
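The selection rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `water_fill` is a hypothetical name, `row` stands for one reservation-table row (one packet counter per bank), and `candidates` marks the banks that are currently free of arrival and departure conflicts. The sketch does not prove the theorem; it only shows the "most empty candidate bank" choice.

```python
def water_fill(row, candidates, section_capacity):
    """Pick the most empty candidate bank in one reservation-table row.

    row: per-bank packet counts for one memory block (updated in place).
    candidates: list of bools, True if the bank has no access conflict now.
    Returns the chosen bank index, or None if no candidate has room.
    """
    best = None
    for bank, count in enumerate(row):
        if candidates[bank] and count < section_capacity:
            if best is None or count < row[best]:
                best = bank
    if best is not None:
        row[best] += 1  # reserve one packet location in that section
    return best

row = [3, 1, 2, 0]
candidates = [True, True, True, False]  # bank 3 is busy, so it is skipped
print(water_fill(row, candidates, section_capacity=4), row)
```

Here bank 3 is the emptiest overall, but it is not a candidate, so the packet goes to bank 1. Ties are broken toward the lower bank index, which is an arbitrary choice in this sketch.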
Slide15
DRAM Selection Logic
[Figure: DRAM selection logic. Reservation table R with K columns (1 ... K) and M rows, including rows s, s+1, ..., s+u; the write candidate vector W; a row with non-candidate entries shown as ∞; the label m = 3]
Slide16
Packet Arrival
Use the write candidate vector W to check arrival conflicts and departure conflicts.
[Figure: same reservation table R and write candidate vector W diagram as on the DRAM Selection Logic slide]
Slide17
Packet Arrival
Pick the most empty bank to store the incoming packet.
[Figure: same reservation table R and write candidate vector W diagram as on the DRAM Selection Logic slide]
Slide18
Packet Departure
Packets in a memory block are moved to one of the departure reorder buffers before their departure times.
Pick the fullest memory section first upon departure.
It is always possible to read all the packets from a memory section, even if the section is full.
All packets are guaranteed to depart on time.
Slide19
SRAM Bypass Buffer
The worst case of the minimum round-trip latency for storing and retrieving a packet to and from one of the DRAM banks is (2N+1)×b time slots.
A bypass buffer stores packets with departure times less than (2N+1)×b time slots away.
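The bypass rule amounts to a single threshold test. The following is a minimal sketch with illustrative names (`use_bypass` and its parameters are not from the slides); it assumes arrival and departure times are both measured in time slots.

```python
def use_bypass(arrival_slot, departure_slot, n, b):
    """True if the packet should go to the SRAM bypass buffer instead of DRAM."""
    min_round_trip = (2 * n + 1) * b  # worst-case minimum DRAM round trip, in slots
    return departure_slot - arrival_slot < min_round_trip

# With n = 4 and b = 3 the threshold is (2*4 + 1) * 3 = 27 time slots.
print(use_bypass(0, 26, n=4, b=3))  # True
print(use_bypass(0, 27, n=4, b=3))  # False
```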
[Figure: bypass buffer with arriving packets, departing packets, a packet locator, and a head pointer]
Slide20
SRAM Requirement (in MB)
N is the number of packets represented by one entry in the reservation table. Line rate is 100 Gb/s.
N      reservation table   departure buffers   bypass buffer   TOTAL
1      30                  0.01                0.01            30.01
32     4.69                0.04                0.04            4.77
64     2.82                0.08                0.08            2.97
128    1.65                0.16                0.16            1.96
256    0.94                0.32                0.32            1.57
512    0.53                0.63                0.63            1.78
1024   0.30                1.25                1.26            2.80
Slide21
SRAM Requirement Comparison
Line rate is 40 Gb/s. RTT is 250 ms. b = 16. K = 3b-1. Average packet size is 40 bytes.
The total SRAM size in our proposed block-based packet buffer is only 8.3% of the previous frame-based scheme and 1.6% of the state-of-the-art SRAM/DRAM prefetching buffer scheme.
prefetching-based   frame-based   This paper
64 MB               12 MB         1 MB
Slide22
Conclusion
Packet buffer architecture for routers with deterministic packet departures, e.g., Switch-Memory-Switch and Load-Balanced Routers.
SRAM requirement grows logarithmically with the line rate.
The required number of DRAM banks is a small constant, independent of the arrival traffic patterns, the number of flows, and the number of priority classes.
Scalable to growing packet storage requirements in future routers while matching increasing line rates.
Slide23
Thank You for Your Kind Attention