SpreadSketch: Toward Invertible and Network-Wide Detection of Superspreaders


Lu Tang, Qun Huang, and Patrick P. C. Lee. The Chinese University of Hong Kong; Institute of Computing Technology, CAS. IEEE INFOCOM 2020.


Presentation Transcript

1. SpreadSketch: Toward Invertible and Network-Wide Detection of Superspreaders. Lu Tang†, Qun Huang‡, and Patrick P. C. Lee†. †The Chinese University of Hong Kong; ‡Institute of Computing Technology, CAS. IEEE INFOCOM 2020.

2. Superspreader Detection. Network traffic is a stream of packets, each denoted by a pair (x, y). x: one or more source fields in the packet header, e.g., a single field or a pair of fields. y: one or more destination fields in the header. Spread of x: the number of distinct y that x connects to. Superspreaders: sources with large spreads. The same definition applies to destinations. Detecting superspreaders in real time is critical for handling DDoS attacks, port scanning, and hot-spots.

3. Challenges. Network-wide view: the many connections of a superspreader can span the entire network. Fast processing and detection speed: e.g., on a 10 Gb/s link, one packet arrives every 67 ns, and mitigating malicious attacks requires fast recovery of superspreaders. Memory efficiency: programmable switches have 1-2 MB per stage [Bosshart, SIGCOMM’13]; servers have tens of MB of SRAM.
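
The 67 ns figure can be checked with a short calculation. The sketch below assumes (the slide does not say so) that the figure refers to back-to-back minimum-size Ethernet frames: 64 bytes of frame plus 8 bytes of preamble and a 12-byte inter-frame gap. The function and constant names are illustrative only.

```python
# Assumption: minimum-size Ethernet frames sent back to back.
# 64-byte frame + 8-byte preamble/SFD + 12-byte inter-frame gap on the wire.
WIRE_BYTES = 64 + 8 + 12
LINK_BPS = 10e9  # 10 Gb/s


def packet_interval_ns(wire_bytes=WIRE_BYTES, link_bps=LINK_BPS):
    """Time between consecutive packets, in nanoseconds."""
    return wire_bytes * 8 / link_bps * 1e9


def max_packets_per_second(wire_bytes=WIRE_BYTES, link_bps=LINK_BPS):
    """Highest packet rate the link can carry at this frame size."""
    return link_bps / (wire_bytes * 8)


print(f"{packet_interval_ns():.1f} ns per packet")
print(f"{max_packets_per_second() / 1e6:.2f} Mpps")
```

Under this assumption, the per-packet budget for updating a sketch is the same few tens of nanoseconds.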

4. Sketches. Sketches are summary data structures, e.g., Count-Min [Cormode, 2005] and Count Sketch [Charikar, 2002]. Idea: project high-dimensional data into a small subspace. Count-Min [Cormode, 2005] is a table in which each element is a counter. Update with a packet: hash the flow key to one counter per row and increment each hashed counter. Query a flow: return the minimum among all hashed counters.
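
The Count-Min update and query steps above can be sketched in a few lines. This is a minimal illustration, not the code used in the talk: the class name, the default sizes, and the use of BLAKE2b as the per-row hash are all assumptions.

```python
import hashlib


class CountMin:
    """Toy Count-Min sketch: `rows` hash rows, `width` counters per row."""

    def __init__(self, rows=4, width=1024):
        self.rows = rows
        self.width = width
        self.table = [[0] * width for _ in range(rows)]

    def _index(self, row, key):
        # One hash function per row, derived by mixing the row id into the input.
        digest = hashlib.blake2b(f"{row}|{key}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.width

    def update(self, key):
        # Increment the one hashed counter in every row.
        for row in range(self.rows):
            self.table[row][self._index(row, key)] += 1

    def query(self, key):
        # Collisions only inflate counters, so the minimum is the tightest estimate.
        return min(self.table[row][self._index(row, key)]
                   for row in range(self.rows))
```

Because every counter is only ever incremented, `query` never underestimates a flow's frequency; it says nothing about how many distinct destinations a source contacted.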

5. Sketches. Good: memory efficiency and fast processing speed. Bad: (1) unable to count the number of distinct flows, because counters in Count-Min only count the frequency of each flow in the stream; (2) non-invertible, i.e., they cannot readily return all superspreaders. For example, Count-Min needs to enumerate all possible flows in the entire flow key space to recover all superspreaders.

6. Our Contribution. SpreadSketch, a fast and invertible sketch for network-wide superspreader detection in network data streams. Fast and invertible: high processing speed and fast recovery of superspreaders. Compact: small and static memory usage. Network-wide: a network-wide view of superspreaders. Theoretical analysis of accuracy, space, and time complexity. Extensive experiments on real-world network traces: higher accuracy and performance than state-of-the-art sketches, and feasibility on a Barefoot Tofino switch with resource efficiency.

7. Observations. SpreadSketch is a non-trivial extension of Count-Min that builds on two observations. Observation I: highly skewed fan-out distributions. CAIDA19: the top 1% of sources account for 60% of the total spread. CAIDA18: the top 10% account for 67%. CAIDA16: the top 10% account for 38%. Consequently, a superspreader dominates its hashed bucket in the sketch w.h.p.

8. Observations. SpreadSketch is a non-trivial extension of Count-Min that builds on two observations. Observation II: rough fan-out estimation via hash strings. A hash string with a longer run of leading 0-bits appears with a geometrically smaller probability. Level value: the number of leading (most significant) 0-bits in the hash string, which provides a rough estimate of the number of distinct pairs. Similar ideas appear in Probabilistic Counting [Flajolet, 1985] and HyperLogLog [Flajolet, 2007]. Example: the hash string 0000101011100011 has four leading 0-bits.
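
A minimal sketch of the level computation follows. The helper `leading_zeros` reproduces the two hash strings shown on the slides; the function `level`, the choice of BLAKE2b, and the 32-bit hash width are assumptions made for illustration.

```python
import hashlib

HASH_BITS = 32  # assumed width of the hash string


def leading_zeros(value, bits):
    """Number of leading 0-bits when `value` is written with `bits` bits."""
    return bits - value.bit_length()


def level(src, dst, bits=HASH_BITS):
    """Level of a (source, destination) pair: leading zeros of its hash."""
    digest = hashlib.blake2b(f"{src}|{dst}".encode(), digest_size=8).digest()
    h = int.from_bytes(digest, "big") >> (64 - bits)  # keep the top `bits` bits
    return leading_zeros(h, bits)


# The hash strings shown on the slides:
print(leading_zeros(0b0000101011100011, 16))  # 4
print(leading_zeros(0b00011011, 8))           # 3
```

With a uniform hash, a pair reaches level k or higher with probability 2^-k, so a source that contacts many distinct destinations is likely to produce at least one high level. Repeating the same (source, destination) pair always yields the same level, which is why duplicates do not inflate the estimate.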

9. Design – Main Idea. Track the source in each bucket that dominates the bucket’s spread: by Observation I, a superspreader is tracked in the sketch w.h.p., which makes the sketch invertible. Find the source with the highest spread by tracking the highest level value: by Observation II, the source with the highest level has the highest spread. Replace the integer counters in the sketch with distinct counters to enable distinct counting in the sketch; we use the multiresolution bitmap [Estan, IMC’03].

10. Design – Data Structure. Data structure: a table of buckets arranged in rows, with multiple buckets per row. Each bucket has three fields. V (spread): a distinct counter that tracks the total spread in the bucket. K (flow key): the candidate source key with the highest level in the bucket. L (level): the maximum level seen in the bucket.

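11. Update. Insert a source-destination pair (x, y) into the sketch. Calculate the level, e.g., a hash of 00011011 gives level 3. Map x to one bucket per row. For each bucket: insert (x, y) into V, then compare the level with L. Case 1: level ≤ L, do nothing. (Figure: a bucket before and after the update; the key remains Key5 and the bitmap changes between V and V’.)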

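12. Update. Insert a source-destination pair (x, y) into the sketch. Calculate the level, e.g., a hash of 00011011 gives level 3. Map x to one bucket per row. For each bucket: insert (x, y) into V, then compare the level with L. Case 1: level ≤ L, do nothing. Case 2: level > L, copy x to K and update L to the new level. (Figure: a bucket before and after the update, with keys Key1 and Key3 and bitmaps V and V’.)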
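
The per-bucket update rule can be sketched as follows. This is an illustration under stated assumptions: a Python `set` stands in for the distinct counter V (the talk uses a multiresolution bitmap), and L starts at -1 so that the first pair always claims an empty bucket. The names `Bucket` and `update_bucket` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Bucket:
    V: set = field(default_factory=set)  # stand-in for the distinct counter
    K: object = None                     # candidate source key
    L: int = -1                          # maximum level seen (assumed start value)


def update_bucket(bucket, src, dst, lvl):
    """Apply one (src, dst) pair with level `lvl` to a single bucket."""
    bucket.V.add((src, dst))   # always record the pair in the distinct counter
    if lvl > bucket.L:         # Case 2: new maximum level, take over the bucket
        bucket.K = src
        bucket.L = lvl
    # Case 1 (lvl <= L): nothing else to do


b = Bucket()
update_bucket(b, "src5", "dst1", 5)
update_bucket(b, "src1", "dst2", 3)   # lower level: key stays src5
update_bucket(b, "src3", "dst3", 7)   # higher level: key becomes src3
print(b.K, b.L, len(b.V))
```

In the full sketch the same rule is applied to one bucket per row, chosen by hashing the source.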

13. Query. Get the estimated spread of a source x. Locate all hashed buckets of x and return the smallest value among their bitmaps V. Optimization: combine all V by bitwise AND and return the value of the combined bitmap.

14. Superspreader Detection. Idea: consider all keys tracked by buckets. Enumerate all buckets; for the key K tracked in each bucket, get its estimated spread and report K as a superspreader if the estimate exceeds the threshold.
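
Update, query, and detection fit together as in the toy sketch below. It is not the authors' implementation: exact Python sets replace the bitmap distinct counters, BLAKE2b replaces the hash functions, and the class name, sizes, and initial level of -1 are assumptions.

```python
import hashlib


def _h(*parts):
    """64-bit hash of the given parts (assumed hash function)."""
    data = "|".join(map(str, parts)).encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


class ToySpreadSketch:
    def __init__(self, rows=4, width=256):
        self.rows, self.width = rows, width
        self.buckets = [[{"V": set(), "K": None, "L": -1} for _ in range(width)]
                        for _ in range(rows)]

    def _bucket(self, row, src):
        return self.buckets[row][_h("row", row, src) % self.width]

    def update(self, src, dst):
        lvl = 64 - _h("level", src, dst).bit_length()  # leading zeros of the hash
        for row in range(self.rows):
            b = self._bucket(row, src)
            b["V"].add((src, dst))
            if lvl > b["L"]:
                b["K"], b["L"] = src, lvl

    def query(self, src):
        # Smallest distinct count among the hashed buckets of `src`.
        return min(len(self._bucket(row, src)["V"]) for row in range(self.rows))

    def detect(self, threshold):
        # Only keys tracked in some bucket are candidates: no key-space scan.
        result = {}
        for row in self.buckets:
            for b in row:
                key = b["K"]
                if key is not None and key not in result:
                    estimate = self.query(key)
                    if estimate > threshold:
                        result[key] = estimate
        return result
```

A short usage example: after feeding one source that contacts 500 destinations and 200 sources that contact two destinations each, `detect(100)` is expected to report only the heavy source, because it holds the highest level in its buckets with high probability.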

15. Network-Wide Deployment. Architecture: more than one monitoring node and a centralized controller. Monitoring node: maintains a sketch and sends it to the controller. Controller: merges all sketches and outputs superspreaders based on the merged sketch. Goal: provide a network-wide view of superspreaders at the controller.

16. Network-Wide Scheme. Monitoring node: no assumption on the deployment, e.g., we can deploy SpreadSketch on any end hosts or switches. Controller: constructs a merged sketch. (Figure: a bucket in Sketch1 holding Key3 and a bucket in Sketch2 holding Key5 are combined into a merged bucket holding Key5.)
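
One way to read the merge figure is sketched below, assuming that the merged bucket takes the union of the two distinct counters and keeps the candidate key with the higher level. The slide does not spell out the rule, so treat this as an interpretation; sets again stand in for bitmaps, and the function names and dictionary layout are hypothetical.

```python
def merge_buckets(a, b):
    """Merge two buckets that sit at the same position in two sketches."""
    winner = a if a["L"] >= b["L"] else b      # candidate with the higher level
    return {"V": a["V"] | b["V"],              # union of the distinct counters
            "K": winner["K"],
            "L": winner["L"]}


def merge_sketches(s1, s2):
    """Merge two sketches given as lists of rows of buckets (same shape)."""
    return [[merge_buckets(a, b) for a, b in zip(row1, row2)]
            for row1, row2 in zip(s1, s2)]


bucket1 = {"V": {("key3", "d1")}, "K": "key3", "L": 2}
bucket2 = {"V": {("key5", "d1"), ("key5", "d2")}, "K": "key5", "L": 4}
print(merge_buckets(bucket1, bucket2)["K"])
```

The merge is only meaningful if all monitoring nodes use the same sketch dimensions and the same hash functions, so that a given source lands in the same bucket position everywhere.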

17. Theoretical Analysis. Setting: SpreadSketch is a table of buckets. On complexity: the space complexity depends on the flow key space and on the number of bits in the bitmap of each bucket; the per-packet update time complexity and the recovery time complexity are also bounded, the latter in terms of the number of superspreaders. On accuracy: bounded errors.

18. Evaluation Setup. Traces: three one-hour traces from CAIDA. Trace statistics (average size per minute; columns: unique flows, unique srcIP, unique dstIP, #pkts): CAIDA16: 97302345298410716529077193; CAIDA18A: 174469433833240346228516482; CAIDA18B: 69921081244441125363994819857. We compare SpreadSketch with: Distinct-Count Sketch (DCS) [Ganguly, ICDCS’07]; RevSketch (REV) [Schweller, ToN’07]; Connection Degree Sketch (CDS) [Wang, TIFS’11]; Fast Sketch (FAST) [Liu, INFOCOM’12]; Count-Min-Heap (CMH) [PODS’05]; Vector Bloom Filter (VBF) [Liu, TIFS’16].

19. Result – Accuracy on CAIDA19. SpreadSketch is more robust and accurate than state-of-the-art sketches. Similar observations hold on the CAIDA18 and CAIDA16 traces.

20. Result – Speed. SpreadSketch (SS) achieves a throughput of more than 22 MPPS, so it can easily keep up with 10 Gb/s line speed. SpreadSketch recovers superspreaders within a few milliseconds. Overall, SpreadSketch achieves both high update speed and high recovery speed.

21. Result – Implementation on Hardware. We implement SpreadSketch in P4 and compile it on a Barefoot Tofino chipset. SpreadSketch consumes limited hardware resources and processes packets at line rate on a Tofino switch. (Table: switch resource usage of SpreadSketch; percentages in brackets are fractions of the total resources available.)

22. Conclusion. SpreadSketch is an invertible sketch that enables fast and accurate network-wide superspreader detection in network data streams. Contributions: a new invertible sketch design to detect superspreaders, with high accuracy and robustness on real-world traces, fast processing and recovery speed, and feasibility on commodity hardware switches; detailed theoretical analysis of both accuracy and complexity; extensive experiments on real-world traces. Source code: http://adslab.cse.cuhk.edu.hk/software/spreadsketch/

23. Thank you!

24. Backup Slides

25. Result – Accuracy on CAIDA16. SpreadSketch (SS) is only slightly lower than CDS and maintains accuracy between 0.86 and 0.96.

26. Result – Accuracy on CAIDA18. SpreadSketch (SS) achieves the best accuracy on the moderately skewed trace.

27. Result – Accuracy in Network-Wide Detection

28. Existing Approaches. Streaming methods: Compact Spread Estimator [Yoon, INFOCOM’09], Online Streaming Module [Zhao, IMC’05]; compact, but non-invertible. Sampling methods: Two-Phase Filtering [Bu, INFOCOM’09], Two Level Filtering [Venkataraman, NDSS’05]; unable to support accurate network-wide detection. Sketch-based methods: Count-Min-Heap [Cormode, PODS’05], FAST [Liu, TDASC’15], VBF [Liu, TIFS’16], CDS [Guan, GLOBECOM’09], RevSketch [NDSI’13]; large memory usage and high memory access overhead.

29. We Do Not Consider Two-Phase Filtering for Two Reasons. First, it does not support network-wide detection: recall drops below 0.65 even when using the maximum sampling rate. Reasons: (1) a superspreader can have a very small spread at each monitoring point and so escape sampling; (2) even if the superspreader is sampled at some of the points, the bias accumulated by the filtering pushes its estimated spread below the threshold. (Figure: accuracy on CAIDA16; results on the other two traces are similar.)

30. We Do Not Consider Two-Phase Filtering for Two Reasons. Second, it does not support querying the spread of an arbitrary flow: it only keeps the spread information of large flows. Comparing SpreadSketch with Two-Phase Filtering (TPF) is similar to comparing sketches with counter-based techniques in top-k problems.

31. The Choice of Distinct Counters. The field V can be any distinct counter that satisfies two requirements: it supports the intersection operation and it supports the union operation. Considering accuracy and efficiency, we use the Multiresolution Bitmap [Estan, IMC’03] in SpreadSketch.
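
To make the two requirements concrete, here is a small bitmap counter with union and bitwise-AND operations. It swaps in a plain linear-counting bitmap (estimate -m·ln(fraction of zero bits)) rather than the multiresolution bitmap named on the slide; the class name, bitmap size, and hash are assumptions.

```python
import hashlib
import math


class BitmapCounter:
    """Linear-counting bitmap; a Python int serves as an m-bit bitmap."""

    def __init__(self, m=1024, bits=0):
        self.m = m
        self.bits = bits

    def add(self, item):
        digest = hashlib.blake2b(str(item).encode(), digest_size=8).digest()
        self.bits |= 1 << (int.from_bytes(digest, "big") % self.m)

    def estimate(self):
        zeros = self.m - bin(self.bits).count("1")
        if zeros == 0:
            return float("inf")  # bitmap saturated, estimate undefined
        return -self.m * math.log(zeros / self.m)

    def union(self, other):
        # Bitwise OR: equals the bitmap built from both item streams together.
        return BitmapCounter(self.m, self.bits | other.bits)

    def intersect(self, other):
        # Bitwise AND: keeps only the bits set in both bitmaps.
        return BitmapCounter(self.m, self.bits & other.bits)


a, b = BitmapCounter(), BitmapCounter()
for i in range(100):
    a.add(i)
for i in range(50, 150):
    b.add(i)
print(round(a.union(b).estimate()))
```

Union is what the controller needs when merging buckets from several monitoring nodes, and the bitwise AND corresponds to the query optimization that combines the bitmaps of all hashed buckets, provided every bitmap maps items to bit positions with the same hash.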