Redundancy elimination as a network service
Aditya Akella
UW-Madison
Growing traffic vs. network performance
Network traffic volumes growing rapidly
Annual growth: overall (45%), enterprise (50%), data center (125%), mobile (125%)*
Growing strain on installed capacity everywhere
  Core (Asian ISPs see 80-90% core utilization), enterprise access, data center, cellular, wireless, …
How to sustain robust network performance?
* Interview with Cisco CEO, Aug 2007, Network World
[Figure: enterprises, mobile users, home users, video, data centers, web content, and other services (backup), all connected through the ISP core; strain on installed link capacities]
Scale link capacities by eliminating redundancy
Popular idea: duplicate suppression, or redundancy elimination (RE)
  Popular objects, partial content matches, backups, app headers
  Effective capacity improves ~2X
Many approaches to RE
  Application-layer caches
  Protocol-independent redundancy elimination (RE) below the app layer: WAN accelerators, de-duplication
  Content distribution, BitTorrent
Point solutions apply to a specific link, protocol, or app
[Figure: the same topology, now dotted with point solutions: WAN optimizers at enterprise edges, dedup/archival in data centers, an ISP HTTP cache, and a CDN for web content]
Universal need to scale capacities
[Figure: the same topology, with its point solutions (WAN optimizers, dedup/archival boxes, ISP HTTP cache, BitTorrent) overlaid by a single network redundancy elimination service]
Point solutions inadequate:
  ✗ Little or no benefit in the core
  ✗ Other links must re-implement specific RE mechanisms
  ✗ Only the attached system/app benefits
Architectural support to address the universal need to scale capacities?
Candidate: RE as a primitive operation supported inherently in the network
  RE as the new narrow waist
  Applies transparently to all links, flows (long/short), apps, unicast/multicast, protocols
IP-layer RE service
[Figure: Wisconsin, Berkeley, and CMU connected across Internet2, with a packet cache at every router]
Apply protocol-independent RE at the packet level on network links
Router upstream removes redundant bytes; router downstream reconstructs the full packet
IP-layer RE using router packet caches
Leverage rapidly declining storage/memory costs
RE as a network service: Why?
Improved performance everywhere even if partially enabled
Generalizes point deployments and app-specific approaches
  Benefits all network end-points and applications; scales capacities universally
Benefits the network core
  Improved switching capacity; responsiveness to sudden overload
Other application domains: data centers, multi-hop wireless
Architectural benefits: enables new protocols and apps
  Min-entropy routing, RE-aware traffic engineering (intra- and inter-domain)
  Anomaly detection, in-network spam filtering
Improves apps: they need not worry about using the network efficiently
  App headers can be verbose, enabling better diagnostics
  Controlling duplicate transmission in app-layer multicast becomes a non-issue
Implications example: Performance benefits
[Figure: Wisconsin, Berkeley, and CMU exchanging content over Internet2; without RE the transfers take 18 packets, with network RE only 12 packets (ignoring tiny packets), 33% lower]
Generalizes point deployments
Benefits the network: improves effective switching capacity
Implications example: New protocols
[Figure: the same Wisconsin/Berkeley/CMU topology over Internet2; simple RE takes 12 packets, RE + routing only 10]
Verbose control messages
New video adaptation algorithms
Anomaly detectors ✓
Spam filtering ✓
Content distribution schemes
Minimum-entropy routing
New, flexible traffic engineering mechanisms
Inter-domain protocols
Talk outline
Is there promise today?
Empirical study of redundancy in network traffic
Extent, patterns; implications for network RE
Is an IP-level RE service achievable today?
Network-wide RE architecture; getting RE to work on ISP routers
What next? Summary and future directions
Redundancy in network traffic
* Joint work with Ashok Anand, Chitra Muthukrishnan (UW-Madison), and Ram Ramjee (MSR-India)
Empirical study of RE
Upstream cache = content table + fingerprint index
RE algorithms
Content-based names (“fingerprints”) for chunks of bytes in the payload
Fingerprints computed over content, looked up to identify redundancy
Downstream cache: content table
Questions:
  How far are existing RE algorithms from optimal? Do better schemes exist?
  Fundamental redundancy patterns and implications for packet caches
[Figure: WAN link between a data center and an enterprise, with an RE cache at each end]
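To make the two cache roles concrete, here is a minimal sketch in Python. It is illustrative only: the class and field names are mine, not from the study; the fingerprint-selection step (MODP/MAXP) is covered on the next slides.

```python
# Illustrative sketch of the upstream/downstream caches described above
# (names are assumed, not taken from the authors' implementation).

class UpstreamCache:
    """Content table + fingerprint index, kept on the sending side of the WAN link."""
    def __init__(self):
        self.content = {}     # packet_id -> full payload bytes (content table)
        self.fp_index = {}    # fingerprint -> (packet_id, offset) (fingerprint index)
        self.next_id = 0

    def insert(self, payload, sampled_fps):
        pid = self.next_id
        self.next_id += 1
        self.content[pid] = payload
        for fp, offset in sampled_fps:
            self.fp_index[fp] = (pid, offset)   # newer packets overwrite old entries
        return pid

    def lookup(self, fp):
        """Return (packet_id, offset) of a cached packet sharing this fingerprint, if any."""
        return self.fp_index.get(fp)


class DownstreamCache:
    """Content table only, kept on the receiving side; used to reconstruct packets."""
    def __init__(self):
        self.content = {}     # packet_id -> full payload bytes

    def insert(self, pid, payload):
        self.content[pid] = payload
```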
Analysis approach
Packet traces:
  11 enterprises (3 TB): small (10-50 IPs), medium (50-100 IPs), large (100+ IPs)
    Protocol composition: HTTP (20-55%), file sharing (25-70%)
  University link (1.6 TB): large university trace (10K IPs); outgoing /24, web server traffic
    Protocol composition: incoming HTTP 60%, outgoing HTTP 36%
Set-up:
  Emulate a memory-bound (100 MB - 4 GB) WAN optimizer
  Emulate only redundancy elimination
  Compute bandwidth savings as (saved bytes / total bytes)
    Includes packet headers in total bytes
    Includes overhead of shim headers used for encoding
RE algorithms: MODP
Spring et al. [SIGCOMM 2000]
Compute fingerprints: slide a window (w) over the packet payload, computing a Rabin fingerprint at each position
Value sampling: sample those fingerprints whose value is 0 mod p
Sampled fingerprints go into a fingerprint table; full payloads go into a packet store
Look up the fingerprints of each new packet in the fingerprint table and derive the maximal match across packets
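A rough sketch of MODP-style selection follows. A simple polynomial rolling hash stands in for true Rabin fingerprinting, and the window size and sampling modulus are representative values I chose, not the ones used in the study.

```python
# MODP sketch (illustrative): slide a W-byte window over the payload, fingerprint
# every window position, and keep only fingerprints whose value is 0 mod P.

W = 32                 # fingerprint window, in bytes (assumed value)
P = 32                 # sampling modulus (assumed value)
BASE = 257
MOD = (1 << 61) - 1    # large prime modulus for the rolling hash

def rolling_fingerprints(payload: bytes):
    """Yield (fingerprint, offset) for every W-byte window of the payload."""
    if len(payload) < W:
        return
    h = 0
    for b in payload[:W]:
        h = (h * BASE + b) % MOD
    top = pow(BASE, W - 1, MOD)
    yield h, 0
    for i in range(1, len(payload) - W + 1):
        # roll the window: drop payload[i-1], add payload[i+W-1]
        h = ((h - payload[i - 1] * top) * BASE + payload[i + W - 1]) % MOD
        yield h, i

def modp_sample(payload: bytes):
    """MODP value sampling: keep fingerprints whose value is 0 mod P."""
    return [(fp, off) for fp, off in rolling_fingerprints(payload) if fp % P == 0]
```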
RE algorithms: MAXP
Similar to MODP
More robust selection criterion
MODP: sample those fingerprints whose value is 0 mod p; can leave no fingerprint to represent some regions (the shaded region in the figure)
MAXP: choose fingerprints that are local maxima over each p-byte region; gives a uniform selection of fingerprints
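For contrast, a sketch of MAXP-style selection. It consumes rolling fingerprints such as those produced in the MODP sketch above and keeps a fingerprint only if it is the maximum within the surrounding p-byte region, which is what gives the more uniform spacing the slide mentions. Names and parameters are again illustrative.

```python
# MAXP sketch (illustrative): pick fingerprints that are local maxima over a
# p-byte neighbourhood instead of sampling by value.

def maxp_sample(fingerprints, p=32):
    """fingerprints: list of (fp, offset) pairs in increasing offset order,
    e.g., list(rolling_fingerprints(payload)) from the MODP sketch."""
    selected = []
    for i, (fp, off) in enumerate(fingerprints):
        is_max = True
        # compare against neighbours within p bytes on the left...
        j = i - 1
        while j >= 0 and off - fingerprints[j][1] <= p:
            if fingerprints[j][0] >= fp:
                is_max = False
                break
            j -= 1
        # ...and within p bytes on the right
        j = i + 1
        while is_max and j < len(fingerprints) and fingerprints[j][1] - off <= p:
            if fingerprints[j][0] >= fp:
                is_max = False
            j += 1
        if is_max:
            selected.append((fp, off))
    return selected
```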
Comparison of RE algorithms
Trace is 68% redundant!
MAXP outperforms MODP by 5-10% in most cases
MAXP's uniform sampling approach wins; MODP loses due to non-uniform clustering of fingerprints
(Store all FPs in a Bloom filter)
Comparison of RE algorithms
GZIP offers 3-15% benefit
Adding 10 ms of buffering makes GZIP better by ~5%
MAXP significantly outperforms GZIP, offering 15-60% bandwidth savings
MAXP followed by GZIP (with 10 ms buffering) is better still, by up to 8%
Zipf-like distribution for chunk matches
Unique chunk matches sorted by their hit counts
Zipfian distribution, slope = -0.97; popular chunks are content fragments < 150 B in size
80% of savings come from 20% of chunks
Need to index 80% of chunks for remaining 20% of savings
Small cache size should capture most benefits?
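A quick back-of-the-envelope check of this 80/20 intuition (illustrative only: it counts hits rather than saved bytes, and the chunk count is made up):

```python
# With hit counts following a Zipf law of slope ~0.97, the most popular 20% of
# chunks cover the bulk of all hits, which is why a small cache already helps.
import numpy as np

def top_fraction_share(n_chunks=1_000_000, slope=0.97, top=0.20):
    ranks = np.arange(1, n_chunks + 1)
    hits = ranks ** (-slope)        # Zipf-like popularity
    hits /= hits.sum()
    k = int(top * n_chunks)
    return hits[:k].sum()

print(f"Top 20% of chunks account for ~{top_fraction_share():.0%} of hits")
```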
Cache size
Small caches can provide significant savings
Diminishing returns for increasing cache size after 250 MB
Build packet caches today using DRAM on routers
Empirical study: Summary
Significant redundancy in network traffic
Careful selection of content fingerprints necessary
Zipf distribution of content popularity; small matches important
Relatively small caches sufficient
SmartRE:
Effective router-level RE
* Joint work with Ashok Anand (UW-Madison) and Vyas Sekar (CMU)
Realizing RE as a network service
Building blocks to realize network RE service?
Goal: optimal performance, or, maximal reduction in traffic footprint
Leverage all possible RE (e.g., inter-path)
Leverage resources optimally
  Cache capacity: finite memory (DRAM) on routers
  Processing constraints: encoding/decoding are memory-access limited and can only run at a certain maximum speed
Hop-by-hop RE revisited
Leverage all RE ✔   Cache constraints ✖   Processing constraints ✖
[Figure: hop-by-hop RE; every router encodes on its outgoing link and decodes on its incoming link]
Same packet encoded and decoded many times; same packet cached many times
Limited throughput (@ 50 ns DRAM):
  Encoding: ~15 memory accesses, ~2.5 Gbps
  Decoding: ~3-4 accesses, > 10 Gbps
RE at the network edge
Leverage all RE ✖   Cache constraints ✔   Processing constraints ✔
[Figure: encode at the ingress edge, decode at the egress edge]
Can leverage intra-path RE; cannot leverage inter-path RE
SmartRE: Motivating question
How can we practically leverage the benefits of network-wide RE optimally?
  Hop-by-hop: leverages all RE ✔, cache constraints ✖, processing constraints ✖
  Edge: leverages all RE ✖, cache constraints ✔, processing constraints ✔
SmartRE: Key ideas
Don’t look at one link at a time; treat RE as a network-wide problem
Cache constraints: routers coordinate caching; each packet is cached only once downstream
Processing constraints: encode @ ingress, decode @ interior/egress; decoding can occur multiple hops after the encoder
High performance: network-wide optimization; account for traffic, routing, constraints, etc.
Cache constraints
Packet arrivals: A, B, A, B
Ingress can store 2 packets; each interior router can store 1 packet
[Figure: cache contents after the 2nd and after the 4th packet when every router caches independently]
Total RE savings in network footprint?
RE on the first link only, no RE on the interior: 2 * 1 = 2 (two redundant packets, each saved over one hop)
Can we do better than this?
Coordinated caching
Packet arrivals: A, B, A, B
Ingress can store 2 packets; each interior router can store 1 packet
[Figure: with coordination, the interior routers cache different packets (A at one, B at the other) along the path]
Total RE savings in network footprint?
RE for packet A saves 2 hops, RE for packet B saves 3 hops: 1 * 2 + 1 * 3 = 5
Processing constraints
Toy example: encoding takes 4 memory ops, decoding takes 2; each router has a 20 memory-op budget
[Figure: hop-by-hop RE; every link runs at 5 enc/s and 5 dec/s]
Total RE savings in network footprint? 5 * 5 = 25 units/s
Note that even though decoders can do more work, they are limited by the encoders
Can we do better than this?
Coordinating processing
Same toy example: 4 memory ops per encode, 2 per decode, 20 memory ops per router
Coordinate: encode only at the ingresses (5 enc/s each); decode at the edge and in the core (5 dec/s and 10 dec/s)
Total RE savings in network footprint? 10 * 3 + 5 * 2 = 40 units/s (packets/s decoded * hops saved)
Many nodes are idle, yet it still does better!
Good for partial deployment also
SmartRE system
[Figure: a central network-wide optimization module pushes “encoding configs” to ingresses and “decoding configs” to interior routers]
Ingress/Encoder Algorithm
Consult the encoding config to check whether this packet needs to be cached (in the content store)
Identify candidate packets to encode against, i.e., packets cached along the path of the new packet
Use MAXP/MODP to find maximal compressible regions
Send the compressed packet; the shim carries info(matched pkt) and a MatchRegionSpec
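A hedged sketch of this encoder loop follows; the Shim fields, the find_match_regions stub, and the overall structure are my own reading of what the slide describes, not SmartRE's code.

```python
from dataclasses import dataclass

@dataclass
class Shim:
    matched_pkt_id: int   # which cached packet was matched (info(matched pkt))
    cached_offset: int    # where the match lives inside the cached packet
    offset: int           # where the matched region started in this packet
    length: int           # together with offset: the MatchRegionSpec

@dataclass
class EncodedPacket:
    shims: list
    literal_bytes: bytes  # payload with the matched regions removed

def find_match_regions(payload, candidates):
    """Stub: the real encoder looks up MODP/MAXP fingerprints of `payload` against
    `candidates` (packets cached along this packet's path) and expands each hit to
    a maximal match. Returns (pkt_id, cached_offset, offset, length) tuples."""
    return []

def encode(payload: bytes, candidates) -> EncodedPacket:
    shims, literals = [], bytearray(payload)
    # remove matched regions starting from the end so earlier offsets stay valid
    regions = sorted(find_match_regions(payload, candidates),
                     key=lambda r: r[2], reverse=True)
    for pid, c_off, off, length in regions:
        shims.append(Shim(pid, c_off, off, length))
        del literals[off:off + length]
    return EncodedPacket(shims, bytes(literals))
```

Whether the new packet itself gets cached is decided separately by the encoding config's hash-range test, sketched two slides ahead.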
Interior/Decoder Algorithm
The shim carries info(matched pkt) and a MatchRegionSpec
Reconstruct the compressed regions from the content store
Consult the decoding config to check whether this packet needs to be cached
Send the uncompressed packet
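The matching decoder step, under the same assumptions as the encoder sketch above:

```python
def decode(encoded, content_store: dict) -> bytes:
    """content_store: packet_id -> full cached payload at this interior router."""
    payload = bytearray(encoded.literal_bytes)
    # splice matched regions back in, in increasing offset order so offsets line up
    for shim in sorted(encoded.shims, key=lambda s: s.offset):
        cached = content_store[shim.matched_pkt_id]
        region = cached[shim.cached_offset:shim.cached_offset + shim.length]
        payload[shim.offset:shim.offset] = region
    return bytes(payload)
```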
Coordinating caching
Non-overlapping hash ranges per path avoid redundant caching! (idea from cSamp, NSDI 08)
[Figure: routers along two paths, each assigned a disjoint per-path hash range such as [0,0.1], [0.1,0.3], [0.1,0.4], [0.7,0.9]]
At each router: hash(pkt.header), get the path info for the packet, and cache only if the hash falls in this router's range for that path
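A minimal sketch of that hash-range test (function and parameter names are mine; the actual config format is not shown on the slide):

```python
import hashlib

def header_hash(header: bytes) -> float:
    """Map a packet header to a point in [0, 1)."""
    digest = hashlib.sha1(header).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def should_cache(header: bytes, path_id: str, my_ranges: dict) -> bool:
    """my_ranges: {path_id: (lo, hi)} assigned to this router by the controller."""
    lo, hi = my_ranges.get(path_id, (0.0, 0.0))
    return lo <= header_hash(header) < hi

# Example: a router assigned [0.1, 0.4) on path "P1" and [0.7, 0.9) on path "P2"
ranges = {"P1": (0.1, 0.4), "P2": (0.7, 0.9)}
print(should_cache(b"example-header-bytes", "P1", ranges))
```

Because the controller hands out disjoint ranges to the routers along a path, exactly one downstream router caches (and can later decode against) any given packet.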
Network-wide optimization
Inputs:
  Traffic patterns: traffic matrix, redundancy profile (intra + inter)
  Topology: topology map, routing matrix
  Router constraints: processing (memory accesses), cache size
Linear program
Objective: maximize footprint reduction, or any other network objective (e.g., TE)
Output: encoding manifests (to ingresses) and decoding manifests (to interiors), each given as (path, hash range) entries
Cache consistency
[Figure: the per-path hash ranges from the previous slide, e.g., [0.1,0.4] and [0.7,0.9]]
What if a traffic surge on the red path causes packets of the black path to be evicted?
Create “logical buckets” for every path-interior pair; evict only within buckets
Valid encodings
[Figure: two paths P1 and P2 that share some routers; P1's routers are assigned hash ranges [0,0.2], [0.2,0.5], [0.5,0.7] (covering [0,0.7]), P2's are assigned [0,0.1], [0.1,0.4], [0.4,0.6] (covering [0,0.6])]
Candidate packets must be available (cached) on the new packet's path
Always safe to encode w.r.t. packets cached on the same path
Safe to encode w.r.t. packets of the other path only if they are cached in routers common to P1 and P2, i.e., hash ∈ [0,0.4]
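The validity rule can be phrased as a small check: a cached packet is a usable candidate only if the router that cached it (determined by its hash and its own path's ranges) also sits on the new packet's path. The sketch below is illustrative; the router IDs and range assignments are invented to mirror the slide.

```python
def caching_router(pkt_hash: float, path_ranges):
    """path_ranges: [(router_id, lo, hi), ...] for the path the cached packet took."""
    for router, lo, hi in path_ranges:
        if lo <= pkt_hash < hi:
            return router
    return None   # not cached anywhere on that path

def valid_candidate(pkt_hash: float, cached_path_ranges, new_path_routers) -> bool:
    r = caching_router(pkt_hash, cached_path_ranges)
    return r is not None and r in new_path_routers

# Toy numbers echoing the slide: P2's ranges, and a new packet travelling on P1
p2_ranges = [("R1", 0.0, 0.1), ("R2", 0.1, 0.4), ("R3", 0.4, 0.6)]
p1_routers = {"R1", "R2", "R5"}   # assume R1, R2 are shared between P1 and P2
print(valid_candidate(0.25, p2_ranges, p1_routers))  # True: cached at R2, which is on P1
print(valid_candidate(0.50, p2_ranges, p1_routers))  # False: cached at R3, not on P1
```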
Network-Wide Optimization
Runs at the NOC / central controller
Inputs: routing, redundancy profile, traffic, device constraints
Output: “encoding configs” to ingresses and “decoding configs” to interiors (per-path hash ranges such as [0.1,0.4], [0.7,0.9])
Non-overlapping hash ranges per path avoid redundant caching
Cache consistency: create “logical buckets” for every path-interior pair; evict only within buckets
Candidate packets must be available on the new packet’s path
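As a rough illustration of the kind of linear program such a controller could solve (the notation below is mine and simplified, not the formulation from the slides), let d_{p,r} be the fraction of path p's matched traffic that router r is assigned to decode:

```latex
\begin{align*}
\max\;& \sum_{p}\sum_{r \in p} \mathit{saved}_{p,r}\, d_{p,r}
      && \text{total reduction in network footprint}\\
\text{s.t.}\;& \sum_{p \ni r} \mathit{mem}_{p}\, d_{p,r} \le \mathit{Cache}_r
      && \text{cache capacity at every router } r\\
 & \sum_{p \ni r} \mathit{decops}_{p}\, d_{p,r} \le \mathit{Proc}_r
      && \text{decode budget at every router } r\\
 & \sum_{r \in p} d_{p,r} \le 1,\qquad d_{p,r} \ge 0
      && \text{for every path } p
\end{align*}
```

An analogous constraint caps the encoding work at each ingress; the optimal d_{p,r} values are then turned into the per-path hash ranges of the encoding and decoding configs.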
Results: Performance benchmarks
Optimization time:
  Network   #PoPs  Time (s)  #Routers  Time (s)
  Level3    63     0.53      315       30
  Sprint    52     0.47      260       21
  Telstra   44     0.29      220       17
Decoding throughput:
  # Match regions  Redundancy  Throughput  Throughput (w/o overhead)
  1                24%         4.9 Gbps    8.7 Gbps
  2                32%         4.5 Gbps    7.9 Gbps
  3                35%         4.3 Gbps    7.7 Gbps
  4                35%         4.3 Gbps    7.6 Gbps
Encoding: 2.2 Gbps (w/o Click overhead)
Encoders can keep up with OC-48 links, decoders with OC-192; amenable to partial deployment
For faster links: a fraction of traffic is left unprocessed, to be acted on by other encoders/decoders (not optimal)
Network-wide benefits (ISPs)
Setup: real traces from U. Wisc, emulated over tier-1 ISP topologies; processing constraints from MemOps and DRAM speed; 2 GB cache per RE device
SmartRE is 4-5X better than the hop-by-hop approach
SmartRE gets 80-90% of ideal unconstrained RE
Results consistent across redundancy profiles and on synthetic traces
SmartRE: Other results
Can we benefit even with partial deployment?
  Even simple strategies work pretty well!
What if redundancy profiles change over time?
  Some “dominant” patterns are stable (for ISPs); good performance even with dated configs
Summary and future directions
RE service to scale link capacities everywhere
Architectural niceties and performance benefits
Substantial redundancy in traffic; high-speed router RE seems feasible
Future directions:
  End-host participation
  Role of different memory technologies: DRAM, flash, and PCM
  Role of RE (+ OpenFlow) in energy management
  TCP interactions with RE
Architectural implications: Enhancement to routing
Network RE: Impact on routing
RE-aware optimization
E.g.: minimize network-wide “traffic footprint”
Traffic footprint on a link = latency * unique bytes on the link
Min-entropy routing
Or, control utilization of expensive cross-country links
Or, route all redundant traffic on low-capacity links
[Figure: ISP with ingress I1 and routers R2, R3, R4; the Network Operations Center combines the redundancy profile with the traffic matrix and policy to compute RE-aware routes]
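Written out, the objective hinted at above is simply (notation mine, not from the slides):

```latex
\min_{\text{routes}} \; \sum_{l \in \text{links}} \mathrm{lat}_l \cdot \mathrm{uniqueBytes}_l
```

where uniqueBytes_l is the post-RE (deduplicated) traffic crossing link l and lat_l is its latency.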
Network-wide utilization
Routed enterprise trace over Sprint topology (AS1239)
Average redundancy is 50%; 1 GB cache per router
Each point: reduction in network utilization with traffic entering at a given city
RE: 12-35% reduction; RE + routing: 20-45% reduction
Effectiveness depends on topology and redundancy profile
Responsiveness to flash crowds
Volume increases at one of the border routers
Redundancy: 20% → 50%; inter-OD redundancy: 0.5 → 0.75; stale routes used
RE and RE + routing absorb the sudden changes; staleness has little impact
End-hosts
Network RE has limitations
Does not extend benefits into end-hosts
  Crucial for last-hop links such as cellular and wireless links
  Energy savings, along with bandwidth and latency
Network RE is useless with encrypted traffic
Some participation from end-hosts is necessary
  End-to-end RE (can be IP or higher-layer)
Challenges and issues
  Compression/decompression overhead on constrained end-points
  Co-existence with network RE: end-hosts signaling what the network should/should not store?
Toward universal RE
[Figure: the earlier topology with its point solutions (WAN optimizers, dedup/archival, ISP HTTP cache, multicast) replaced by a single network redundancy elimination service]
Multiple point solutions for RE today
  ✗ No benefits in the core
  ✗ Other links must re-implement specific RE mechanisms
  ✗ Only the attached system/app benefits
Universal need to scale capacities?
RE: a primitive operation supported inherently in the network
  Applies to all links, flows, communication models
  Transparent network service; optional end-point participation
Why? How?
Traffic growth: Sustaining network performance
Network traffic growing rapidly
Annual growth: Enterprise (50%), backbone (45%), mobile settings (125%)
Strain on installed capacity: sustain network performance?
Core, enterprise/data center, last hop wireless links
A key idea: leverage duplication
  Popular objects, partial content matches, file backups, application headers, …
  Identify and remove data redundancies
[Figure: enterprises, mobile users, home users, web content, data centers, video, and other services (backup) connected through ISPs]
Toward universal RE
[Figure: the same topology as before, with its ad hoc point solutions replaced by a network redundancy elimination service]
Ad hoc point deployments for RE at the network edge
  ✗ No benefits in the core
  ✗ Other links must re-implement specific RE mechanisms
  ✗ Only the attached system/app benefits
RE: a primitive operation supported inherently in the network
  Applies to all links, flows (long/short), apps, unicast/multicast
  Transparent network service; optional end-point participation
  App, end-to-end, and link performance should only improve
How? Implications?
Architectural support to address the universal need to scale capacities? Implications?
Redundancy elimination (RE): Many solutions
Application-layer approaches
E.g., Web and P2P caches, proxies
  Protocol-specific; miss sub-object redundancies; dynamic objects
Protocol-independent redundancy elimination
  Operates below the app layer: removes duplicate bytes from any network flow
  E.g., WAN optimizers, de-duplication or single-copy storage
  Much more effective and general than app-layer techniques
Other approaches with similar goals
  CDNs, peer-assisted download (e.g., BitTorrent), multicast protocols
  Clean-slate: Data-Oriented Transport and Similarity-Enhanced Transfers
All are point solutions, specific to a link, system, application, or protocol
How? Ideas from WAN optimization
Network must examine byte streams, remove duplicates, reinsert
Building blocks from WAN optimizers: RE agnostic to application, ports or flow semantics
Upstream cache = content table + fingerprint index
  Fingerprint index: content-based names for chunks of bytes in the payload
  Fingerprints computed over content, looked up to identify redundant byte-strings
Downstream cache: content table
[Figure: WAN link between a data center and an enterprise, with a cache at each end]