Counter Braids: A Novel Counter Architecture for Per-Flow Measurement

Yi Lu, Department of EE, Stanford University, yi.lu@stanford.edu
Andrea Montanari, Departments of EE and Stats, Stanford University, montanar@stanford.edu
Balaji Prabhakar, Departments of EE and CS, Stanford University, balaji@stanford.edu
Sarang Dharmapurikar, Nuova Systems, Inc, San Jose, California, sarang@nuovasystems.com
Abdul Kabbani, Department of EE, Stanford University, akabbani@stanford.edu

ABSTRACT
Fine-grained network measurement requires routers and switches to update large arrays of counters at very high link speed (e.g. 40 Gbps). A naive algorithm needs an infeasible amount of SRAM to store both the counters and a flow-to-counter association rule, so that arriving packets can update corresponding counters at link speed. This has made accurate per-flow measurement complex and expensive, and motivated approximate methods that detect and measure only the large flows.

This paper revisits the problem of accurate per-flow measurement. We present a counter architecture, called Counter Braids, inspired by sparse random graph codes. In a nutshell, Counter Braids "compresses while counting". It solves the central problems (counter space and flow-to-counter association) of per-flow measurement by "braiding" a hierarchy of counters with random graphs. Braiding results in drastic space reduction by sharing counters among flows; and using random graphs generated on-the-fly with hash functions avoids the storage of flow-to-counter association.

The Counter Braids architecture is optimal (albeit with a complex decoder) as it achieves the maximum compression rate asymptotically. For implementation, we present a low-complexity message passing decoding algorithm, which can recover flow sizes with essentially zero error. Evaluation on Internet traces demonstrates that almost all flow sizes are recovered exactly with only a few bits of counter space per flow.

Categories and Subject Descriptors
C.2.3 [Computer Communication Networks]: Network Operations - Network Monitoring; E.1 [Data Structures]

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGMETRICS'08, June 2-6, 2008, Annapolis, Maryland, USA. Copyright 2008 ACM 978-1-60558-005-0/08/06 ...$5.00.

General Terms
Measurement, Algorithms, Theory, Performance

Keywords
Statistics Counters, Network Measurement, Message Passing Algorithms

1. INTRODUCTION
There is an increasing need for fine-grained network measurement to aid the management of large networks [14]. Network measurement consists of counting the size of a logical entity called "flow", at an interface such as a router. A flow is a sequence of packets that satisfy a common set of rules. For instance, packets with the same source (destination) address constitute a flow. Measuring flows of this type gives the volume of upload (download) by a user and is useful for accounting and billing purposes. Measuring flows with a specific flow 5-tuple in the packet header gives more detailed information such as routing distribution and types of traffic in the network. Such information can help greatly with traffic engineering and bandwidth provisioning. Flows can also be defined by packet classification. For example, ICMP Echo packets used for network attacks form a flow. Measuring such flows is useful during and after an attack for anomaly detection and network forensics.

Currently there exists no large-scale statistics counter architecture that is both cheap and accurate. This is mainly due to the lack of affordable high-density high-bandwidth memory devices. To illustrate the problem, the processing time for a 64-byte packet at a 40-Gbps OC-768 link is about 12 ns. This requires memories with access time much smaller than that of commercially available DRAM (whose access time is tens of nsec), and makes it necessary to employ SRAMs. However, due to their low density, large SRAMs are expensive and difficult to implement on-chip. It is, therefore, essential to find a counter architecture that minimizes memory space. There are two main components of the total space requirement:

1. Counter space. Assuming that a million distinct flows are observed in an interval¹ and using one 64-bit counter per flow (a standard vendor practice [20]), 8 MB of SRAM is needed for counter space alone.

2. Flow-to-counter association rule. The set of active flows varies over time, and the flow-to-counter association rule needs to be dynamically constructed. For a small number of flows, a content-addressable-memory (CAM) is used in most applications. However, the high power consumption and heat dissipation of CAMs forbid their use in realistic scenarios, and SRAM hash tables are used to store the flow-to-counter association rule. This requires at least another 10 MB of SRAM.

The large space requirement not only considerably increases the cost of line cards, but also hinders a compact layout of chips due to the low density of SRAM.

¹ Our OC-48 (2.5 Gbps) trace data show that there are about 900,000 distinct flow 5-tuples in a 5-minute interval. On 40-Gbps links, there can easily be an excess of a million distinct flow 5-tuples in a short observation interval. Or, for measuring the frequency of prefix accesses, one needs about 500,000 counters, which is the current size of IPv4 routing tables [20]. Future routers may easily support more than a million prefixes.

1.1 Previous Approaches
The wide applicability and inherent difficulty of designing statistics counters have attracted the attention of the research community. There are two main approaches: (i) exact counting using a hybrid SRAM-DRAM architecture, and (ii) approximate counting by exploiting the heavy-tail nature of flow size distribution. We review these approaches below.

Exact counting. Shah et al. [22] proposed and analyzed a hybrid architecture, taking the first step towards an implementable large-scale counter array. The architecture consists of shallow counters in fast SRAM and deep counters in slow DRAM. The challenge is to find a simple algorithm for updating the DRAM counters so that no SRAM counter overflows in between two DRAM updates. The algorithm analyzed in [22] was subsequently improved by Ramabhadran and Varghese [20] and Zhao et al. [23]. This reduced the algorithm complexity, making it feasible to use a small SRAM with 5 bits per flow to count flow sizes in packets (not bytes). However, all the papers above suffer from the following drawbacks: (i) deep (typically 64 bits per flow) off-chip DRAM counters are needed, (ii) costly SRAM-to-DRAM updates are required, and (iii) the flow-to-counter association problem is assumed to be solved using a CAM or a hash table. In particular, they do not address the flow-to-counter association problem.

Approximate counting. To keep cost acceptable, practical solutions from the industry and academic research either sacrifice the accuracy or limit the scope of measurement. For example, Cisco's NetFlow [1] counts both 5-tuples and per-prefix flows based on sampling, which introduces a significant 9% relative error even for large flows and more errors for smaller flows [12]. Juniper Networks introduced filter-based accounting [2] to count a limited set of flows predefined manually by operators. The "sample-and-hold" solution proposed by Estan and Varghese in [12], while achieving high accuracy, measures only flows that occupy more than 0.1% of the total bandwidth. Estan and Varghese's approach introduced the idea of exploiting the heavy-tail flow size distribution: since a few large flows bring most of the data, it is feasible to quickly identify these large flows and measure their sizes only.

1.2 Our Approach
The main contribution of this paper is an SRAM-only large-scale counter architecture with the following features:

1. Flow-to-counter association using a small number (e.g. 3) of hash functions.

2. Incremental compression of flow sizes as packets arrive; only a small number (e.g. 3) of counters are accessed at each packet arrival.

3. Asymptotic optimality. We have proved in [17] that Counter Braids (CB), with an optimal (but NP-hard) decoder, has an asymptotic compression rate matching the information theoretic limit. The result is surprising since CB forms a restrictive family of compressors.

4. A linear-complexity message passing decoding algorithm that recovers all flow sizes from compressed counts with essentially zero error. Total space in CB needed for exact recovery is close to the optimal compression of flow sizes.

5. The message passing algorithm is analyzable, enabling the choice of design parameters for different hardware requirements.

Remark: We note that CB has the disadvantage of not supporting instantaneous queries of flow sizes. All flow sizes are decoded together at the end of a measurement epoch. We plan to address this problem in future work.

Informal description. Counter Braids is a hierarchy of counters braided via random graphs in tandem. Figure 1(a) shows a naive counter architecture that stores five flow sizes in counters of equal depth, which has to exceed the size of the largest flow. Each bit in a counter is shown as a circle. The least significant bit (LSB) is the one closest to the flow node. Filled circles represent a 1, and unfilled circles a 0. This structure leads to an enormous wastage of space because the majority of flows are small.

Figure 1(b) shows CB for storing the same flow sizes. It is worth noting that: (i) CB has fewer "more significant bits" and they are shared among all flows, and (ii) the exact flow sizes can be obtained by "decoding" the bit pattern stored in CB. A comparison of the two
figures clearly shows a great reduction in space.

[Figure 1: (a) A simple counter structure. (b) Counter Braids. (filled circle = 1, unfilled circle = 0).]

1.3 Related Theoretical Literature
Compressed Sensing. The idea of Counter Braids is thematically related to compressed sensing [6, 11], whose central innovation is summarized by the following quote:

    Since we can "throw away" most of our data and still be able to reconstruct the original with no perceptual loss (as we do with ubiquitous sound, image and data compression formats,) why can't we directly measure the part that will not end up being "thrown away"? [11]

For the network measurement problem, we obtain a vector of counter values, $c$, via CB, from the flow sizes $f$. If $f$ has a small entropy, the vector $c$ occupies much less space than $f$; it constitutes "the part (of $f$) that will not end up being thrown away." An off-chip decoding algorithm then recovers $f$ from $c$. While Compressed Sensing and CB are thematically related, they are methodologically quite different: Compressed Sensing computes random linear transformations of the data and uses LP (linear programming) reconstruction methods; whereas CB uses a multi-layered non-linear structure and a message passing reconstruction algorithm.

Sparse random graph codes. Counter Braids is methodologically inspired by the theory of low-density parity check (LDPC) codes [13, 21]. See also related literature on Tornado codes [18] and Fountain codes [4]. From the information theoretic perspective, the design of an efficient counting scheme and a good flow size estimation is equivalent to the design of an efficient compressor, or a source code [8]. However, the network measurement problem imposes a stringent constraint on such a code: each time the size of a flow changes (because a new packet arrives), a small number of operations must be sufficient to update the compressed information. This is not the case with standard source codes (such as the Lempel-Ziv algorithm), where changing a single bit in the source stream may completely alter the compressed version. We
find that the class of source codes dual to LDPC codes [5] work well under this constraint; using features of these codes makes CB a good "incremental compressor."

There is a problem in using the design of LDPC codes for network measurement: with the heavy-tailed distribution, the flow sizes are a priori unbounded. In the channel coding language, this is equivalent to using a countable but infinite input alphabet. As a result, new ideas are developed for proving the achievability of optimal asymptotic compression rate. The full proof is contained in [17] and we state the theorem in the appendix for completeness.

The large alphabet size also makes iterative message passing decoding algorithms [15], such as Belief Propagation, highly complex to implement, as BP passes probabilities rather than numbers. In this paper, we present a novel message passing decoding algorithm of low complexity that is easy to implement. The sub-optimality of the message passing algorithm naturally requires more counter space than the information theoretic limit. We characterize the minimum space required for zero asymptotic decoding error using "density evolution" [21]. The space requirement can be further optimized with respect to the number of layers in Counter Braids, and the degree distribution of each layer. The optimized space is close to the information theoretic limit, enabling CB to fit into small SRAM.

Count-Min Sketch. Like Counter Braids, the Count-Min sketch [7] for data stream applications is also a random hash-based structure. With Count-Min, each flow hashes to and updates $d$ counters; the minimum value of the $d$ counters is retrieved as the flow estimate. The Count-Min sketch provides probabilistic guarantees for the estimation error: with at least $1-\delta$ probability, the estimation error is less than $\epsilon|f|_1$, where $|f|_1$ is the sum of all flow sizes. To have small $\epsilon$ and $\delta$, the number of counters needs to be large.

The Count-Min sketch is different from Counter Braids in the following ways: (a) There is no "braiding" of counters, hence no compression. (b) The estimation algorithm for the Count-Min sketch is one-step, whereas it is iterative for CB. In fact, comparing the Count-Min algorithm to our reconstruction algorithm on a one-layer CB, it is easy to see that the estimate by Count-Min is exactly the estimate after the
first iteration of our algorithm. Thus, CB performs at least as well as the Count-Min algorithm.² (c) Our reconstruction algorithm detects errors. That is, it can distinguish the flows whose sizes are incorrectly estimated, and produce an upper and lower bound of the true value; whereas the Count-Min sketch only guarantees an over-estimate. (d) CB needs to decode all the flow sizes at once, unlike the Count-Min algorithm which can estimate a single flow size. Thus, Count-Min is better at handling online queries than CB.

² This is similar to the benefit of Turbo codes over conventional soft-decision decoding algorithms and illustrates the power of the "Turbo principle."

Structurally related to Counter Braids (random hashing of flows into counters and a recovery algorithm) is the work of Kumar et al. [16]. The goal of that work is to estimate the flow size distribution and not the actual flow sizes, which is our aim.

In Section 2, we define the goals of this paper and outline our solution methodology. Section 3 describes the Counter Braids architecture. The message passing decoding algorithm is described in Section 4 and analyzed in Section 5. Section 6 explores the choice of parameters for multi-layer CB. The algorithm is evaluated using traces in Section 7. We discuss implementation issues in Section 8 and outline further work in Section 9.

2. PROBLEM FORMULATION
We divide time into measurement epochs (e.g. 5 minutes). The objective is to count the number of packets per flow for all active flows within a measurement epoch. We do not deal with the byte-counting problem in this paper due to space limitation, but there is no constraint in using Counter Braids for byte-counting.

Goals: As mentioned in Section 1, the main problems we wish to address are: (i) compacting (or eliminating) the space used by flow-to-counter association rule, and (ii) saving counter space and incrementally compressing the counts.
Additionally, we would like (iii) a low-complexity algorithm to reconstruct flow sizes at the end of a measurement epoch.

Solution methodology: Corresponding to the goals, we (i) use a small number of hash functions, (ii) braid the counters, and (iii) use a linear-complexity message-passing algorithm to reconstruct flow sizes. In particular, by using a small number of hash functions, we eliminate the need for storing a flow-to-counter association rule.

Performance measures:

(1) Space: measured in number of bits per flow occupied by counters. We denote it by $r$ (to suggest compression rate as in the information theory literature.) Note that the number of counters is not the correct measure of compression rate; rather, it is the number of bits.

(2) Reconstruction error: measured as the fraction of flows whose reconstructed value is different from the true value:

$$P_{err} \equiv \frac{1}{n}\sum_{i=1}^{n} I\{\hat{f}_i \neq f_i\},$$

where $n$ is the total number of flows, $\hat{f}_i$ is the estimated size of flow $i$ and $f_i$ the true size. $I$ is the indicator function, which returns 1 if the expression in the bracket is true and 0 otherwise. We chose this metric since we want exact reconstruction.

(3) Average error magnitude: defined as the ratio of the sum of absolute errors and the number of errors:

$$E_m = \frac{\sum_i |f_i - \hat{f}_i|}{\sum_i I(f_i \neq \hat{f}_i)}.$$

It measures how big an error is when an error has occurred.

The statement of asymptotic optimality in the appendix yields that it is possible to keep space equal to the flow-size entropy, and have reconstruction error going to 0 as the number of flows goes to infinity.

Both analysis (Section 5) and simulations (Section 7) show that with our low-complexity message passing decoding algorithm, we can keep space close to the flow-size entropy and obtain essentially zero reconstruction error. In addition, the algorithm offers a graceful degradation of error when space is further reduced, even below the flow-size entropy. Although reconstruction error becomes significant, average error magnitude remains small, which means that most flow-size estimates are close to their true values.

3. OUR SOLUTION
The overall architecture of our solution is shown in Figure 2. Each arriving packet updates Counter Braids in on-chip SRAM. This constitutes the encoding stage if we view measurement as compression. At the end of a measurement epoch, the content of Counter Braids, i.e., the compressed counts, is transferred to an offline processing unit, such as a PC. A reconstruction algorithm then recovers the list of <flow ID, size> pairs.

We describe CB in Section 3.1 and specify the mapping that solves the flow-to-counter association problem in Section 3.2. We describe the updating scheme, or the on-chip encoding algorithm, in Section 3.3, leaving the description of the reconstruction algorithm to Section 4.

[Figure 2: System Diagram.]

3.1 Counter Braids
Counter Braids has a layered structure. The $l$-th layer has $m_l$ counters with a depth of $d_l$ bits. Let the total number of layers be $L$. In practice, $L = 2$ is usually sufficient as will be shown in Section 6. Figure 3 illustrates the case where $L = 2$. For a complete description of the structure, we leave $L$ as a parameter.
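As a concrete reading of the reconstruction error and the average error magnitude defined in Section 2, the following is a minimal sketch of how the two quantities could be computed from a list of true and estimated flow sizes. The function names and the sample data are hypothetical and chosen only for illustration; returning 0 when there are no errors is our own convention, since the ratio is undefined in that case.

```python
def reconstruction_error(true_sizes, est_sizes):
    """P_err: fraction of flows whose estimate differs from the true size."""
    assert len(true_sizes) == len(est_sizes)
    wrong = sum(1 for f, fh in zip(true_sizes, est_sizes) if f != fh)
    return wrong / len(true_sizes)


def average_error_magnitude(true_sizes, est_sizes):
    """E_m: sum of absolute errors divided by the number of errors."""
    abs_err = [abs(f - fh) for f, fh in zip(true_sizes, est_sizes)]
    n_wrong = sum(1 for e in abs_err if e != 0)
    # Convention (an assumption of this sketch): 0.0 when nothing is wrong.
    return sum(abs_err) / n_wrong if n_wrong else 0.0


# Hypothetical data: two of the five flows are mis-estimated.
true_sizes = [1, 1, 3, 32, 2]
est_sizes = [1, 2, 3, 35, 2]
p_err = reconstruction_error(true_sizes, est_sizes)    # 2 errors out of 5 flows
e_m = average_error_magnitude(true_sizes, est_sizes)   # (1 + 3) / 2
```

Note that the second measure is normalised by the number of errors rather than the number of flows, so it stays informative even when the first is close to zero.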
[Figure 3: Two-layer Counter Braids with two hash functions and status bits.]

We will show in later sections that we can use a decreasing number of counters in each layer of CB, and still be able to recover the flow sizes correctly. The idea is that given a heavy-tail distribution for flow sizes, the more significant bits in the counters are poorly utilized; since braiding allows more significant bits to be shared among all flows, a reduced number of counters in the higher layers suffice.

Figure 3 also shows an optional feature of CB, the status bits. A status bit is an additional bit on a first-layer counter. It is set to 1 after the corresponding counter first overflows. Counter Braids without status bits is theoretically sufficient: the asymptotic optimality result in the appendix is shown without status bits, assuming a high-complexity optimal decoder. However, in practice we use a low-complexity message passing decoder, and the particular shape of the network traffic distribution is better exploited with status bits. Status bits occupy additional space, but provide useful information to the message-passing decoder so that the number of second-layer counters can be further reduced, yielding a favorable tradeoff in space. Status bits are taken into account when computing the total space; in particular, it figures in the performance measure, $r$, "space in number of bits per flow." In CB with more than two layers, every layer except the last will have counters with status bits.

3.2 The Random (Hash) Mappings
We use the same random mapping in two settings: (i) between flows and the first-layer counters, and (ii) between two consecutive layers of counters. The dashed arrows in Figure 3 illustrate both (i) and (ii) (which is between the
first and second layer counters.)

Consider the random mapping between flows and the layer-1 counters. For each flow ID, we apply $k$ pseudo-random hash functions with a common range $\{0, \ldots, m_1 - 1\}$, where $m_1$ is the number of counters in layer 1, as illustrated in Figure 3 (with $k = 2$.) The mapping has the following features:

1. It is dynamically constructed for a varying set of active flows, by applying hash functions to flow IDs. In other words, no memory space is needed to describe the mapping explicitly. The storage for the flow-to-counter association is simply the size of description of the $k$ hash functions and does not increase with the number of flows $n$.

2. The number of hash functions $k$ is set to a small constant (e.g. 3). This allows counters to be updated with only a small number of operations at a packet arrival.

Remark. Note that the mapping does not have any special structure. In particular, it is not bijective. This necessitates the use of a reconstruction algorithm to recover the flow sizes. Using $k > 1$ adds redundancy to the mapping and makes recovery possible. However, the random mapping does more than simplifying the flow-to-counter association. In fact, it performs the compression of flow sizes into counter values and reduces counter space.

Next consider the random mapping between two consecutive layers of counters. For each counter location (in the range $\{0, \ldots, m_l - 1\}$) in the $l$-th layer, we apply $k$ hash functions to obtain the corresponding $(l+1)$-th layer counter locations (in the range $\{0, \ldots, m_{l+1} - 1\}$). It is illustrated in Figure 3 with $k = 2$. The use of hash functions enables us to implement the mapping without extra circuits in the hardware; and the random mapping further compresses the counts in layer-2 counters.

3.3 Encoding: The Updating Algorithm
The initialization and update procedures of a two-layer Counter Braids with 2 hash functions at each layer are specified in Exhibit 1. The procedures include both the generation of random mapping using hash functions and the updating scheme. When a packet arrives, both counters its flow label hashes into are incremented. And when a counter in layer 1 overflows, both counters in layer 2 it hashes into are incremented by 1, like a carry-over. The overflowing counter is reset to 0 and the corresponding status bit is set to 1.

It is evident from the exhibit that the amount of updating required is very small. Yet after each update, the counters store a compressed version of the most up-to-date flow sizes. The incremental nature of this compression algorithm is made possible with the use of random sparse linear codes, which we shall further exploit at the reconstruction stage.

Exhibit 1: The Update Algorithm

  Initialize
    for layer l = 1 to 2
      for counter i = 1 to m_l
        counters[l][i] = 0

  Update
    Upon the arrival of a packet pkt
      idx1 = hash-function1(pkt);
      idx2 = hash-function2(pkt);
      counters[1][idx1] = counters[1][idx1] + 1;
      counters[1][idx2] = counters[1][idx2] + 1;
      if counters[1][idx1] overflows,
        Update second-layer counters (idx1);
      if counters[1][idx2] overflows,
        Update second-layer counters (idx2)

  Update second-layer counters (idx)
    statusbit[1][idx] = 1;
    idx3 = hash-function3(idx);
    idx4 = hash-function4(idx);
    counters[2][idx3] = counters[2][idx3] + 1;
    counters[2][idx4] = counters[2][idx4] + 1

The update of the second-layer counters can be pipelined. It can be executed together with the next update of the first-layer counters. In general, pipelining can be used for CB with multiple layers.

[Figure 4: A toy example for updating. Numbers next to flow nodes are current flow sizes. Dotted lines indicate hash functions. Thick lines indicate hash functions being computed by an arriving packet. The flow with an arriving packet is indicated by an arrow.]

Figure 4 illustrates the updating algorithm with a toy example. (a) shows the initial state of CB with two flows. In (b), a new flow arrives, bringing the first packet; a layer-1 counter overflows and updates two layer-2 counters. In (c), a packet of an existing flow arrives and no overflow occurs. In (d), another packet of an existing flow arrives and another layer-1 counter overflows.
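The update procedure of this section can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the on-chip implementation: the class name `TwoLayerCounterBraid`, the SHA-256 based helper `_hash`, and the sizes used in the example are all hypothetical, and layer-2 counters are modelled as unbounded Python integers rather than fixed-depth counters. As in Exhibit 1, if two hash functions happen to map to the same counter, that counter is simply incremented twice.

```python
import hashlib


def _hash(key, salt, modulus):
    """Illustrative hash (an assumption): SHA-256 of salt and key, reduced to [0, modulus)."""
    digest = hashlib.sha256(f"{salt}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % modulus


class TwoLayerCounterBraid:
    def __init__(self, m1, m2, d1, k=2):
        self.m1, self.m2, self.d1, self.k = m1, m2, d1, k
        self.layer1 = [0] * m1   # d1-bit counters
        self.status = [0] * m1   # one status bit per layer-1 counter
        self.layer2 = [0] * m2   # depth not modelled in this sketch

    def _layer1_slots(self, flow_id):
        return [_hash(flow_id, f"L1-{j}", self.m1) for j in range(self.k)]

    def _layer2_slots(self, idx):
        return [_hash(idx, f"L2-{j}", self.m2) for j in range(self.k)]

    def update(self, flow_id):
        """Process one packet of the given flow."""
        for idx in self._layer1_slots(flow_id):
            self.layer1[idx] += 1
            if self.layer1[idx] == 1 << self.d1:      # layer-1 overflow
                self.layer1[idx] = 0                  # counter wraps to 0
                self.status[idx] = 1                  # remember that it overflowed
                for idx2 in self._layer2_slots(idx):  # carry into layer 2
                    self.layer2[idx2] += 1


# Hypothetical toy run: 13 packets from three flows.
cb = TwoLayerCounterBraid(m1=8, m2=4, d1=2)
packets = ["flowA"] * 9 + ["flowB"] * 3 + ["flowC"]
for p in packets:
    cb.update(p)

# Each overflow adds k to layer 2, so the total packet mass is conserved.
overflows = sum(cb.layer2) // cb.k
total = sum(cb.layer1) + (overflows << cb.d1)
```

The final two lines check a conservation property implied by the update rule: every packet adds `k` to layer 1, and every overflow moves `2**d1` out of layer 1 while adding `k` to layer 2, so `total` equals `k` times the number of packets.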
4. MESSAGE PASSING DECODER
The sparsity of the random graphs³ in CB opens the way to using low-complexity message passing algorithms for reconstruction of flow sizes, but the design of such an algorithm is not obvious. In the case of LDPC codes, message passing decoding algorithms hold the promise of approaching capacity with unprecedentedly low complexity. However, the algorithms used in coding, such as Belief Propagation, have increasing memory requirement as the alphabet size grows, since BP passes probability distributions instead of single numbers. We develop a novel message passing algorithm that is simple to implement on countable alphabets.

³ Each random mapping in CB is a random bipartite graph with edges generated by the $k$ hash functions. It is sparse because the number of edges is linear in the number of nodes, as opposed to quadratic for a complete bipartite graph.

4.1 One Layer
Consider the random mapping between flows and the first-layer counters. It is a bipartite graph with flow nodes on the left and counter nodes on the right, as shown in Figure 5. An edge connects flow $i$ and counter $a$ if one of the $k$ hash functions maps flow $i$ to counter $a$. The vector $f$ denotes flow sizes and $c$ denotes counter values:

$$c_a = \sum_{i \in \partial a} f_i,$$

where $\partial a$ denotes all the flows that hash into counter $a$. The problem is to estimate $f$ given $c$.

[Figure 5: Message passing on a bipartite graph with flow nodes (circles) and counter nodes (rectangles.)]

Message passing algorithms are iterative. In the $t$-th iteration messages are passed from all flow nodes to all counter nodes and then back in the reverse direction. A message goes from flow $i$ to counter $a$ (denoted by $\mu_{ia}$) and vice versa (denoted by $\nu_{ai}$) only if nodes $i$ and $a$ are neighbors (connected by an edge) on the bipartite graph.

Our algorithm is described in Exhibit 2. The messages $\mu_{ia}(0)$ are initialized to 0, although any initial value less than the minimum flow size, $\min$, will work just as well. The interpretation of the messages is as follows: $\nu_{ai}$ conveys counter $a$'s guess of flow $i$'s size based on the information it received from neighboring flows other than flow $i$. Conversely, $\mu_{ia}$ is the guess by flow $i$ of its own size, based on the information it received from neighboring counters other than counter $a$.

Exhibit 2: The Message Passing Decoding Algorithm

   1: Initialize
   2:   $\min$ = minimum flow size;
   3:   $\mu_{ia}(0) = 0 \;\; \forall i$ and $\forall a$;
   4:   $c_a$ = $a$-th counter value
   5: Iterations
   6: for iteration number $t = 1$ to $T$
   7:   $\nu_{ai}(t) = \max\{\, c_a - \sum_{j \neq i} \mu_{ja}(t-1),\; \min \,\}$;
   8:   $\mu_{ia}(t) = \min_{b \neq a} \nu_{bi}(t)$ if $t$ is odd, $\max_{b \neq a} \nu_{bi}(t)$ if $t$ is even.
   9: Final Estimate
  10:   $\hat{f}_i(T) = \min_a\{\nu_{ai}(T)\}$ if $T$ is odd, $\max_a\{\nu_{ai}(T)\}$ if $T$ is even.

[Figure 6: The decoding algorithm over 4 iterations. Numbers in the topmost figure are true flow sizes and counter values. In an iteration, numbers next to a node are messages on its outgoing edges, from top to bottom. Each iteration involves messages going from flows to counters and back from counters to flows.]

Remark 1. Since $\mu_{ia}(0) = 0$, $\nu_{ai}(1) = c_a$ and

$$\hat{f}_i(1) = \min_a\{c_a\},$$

which is precisely the estimate of the Count-Min algorithm. Thus, the estimate of Count-Min is the estimate of our message-passing algorithm after the first iteration.

Remark 2. The distinction between odd and even iterations at lines 8 and 10 is due to the "anti-monotonicity property" of the message-passing algorithm, to be discussed in Section 5.

Remark 3. It turns out that the algorithm remains unchanged if the minimum or maximum at line 8 is over all incoming messages, that is,

$$\mu_{ia}(t) = \begin{cases} \min_b \nu_{bi}(t) & \text{if } t \text{ is odd,} \\ \max_b \nu_{bi}(t) & \text{if } t \text{ is even.} \end{cases}$$

The change will save some computations in implementation. The proof of this fact and ensuing analytical consequences is deferred to forthcoming publications. In this paper, we stick to the algorithm in Exhibit 2.
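The decoder of Exhibit 2 can be sketched in a few lines. The code below is an illustrative single-layer implementation under stated assumptions, not the authors' code: the function name `decode` and the argument names `flow_edges` and `f_min` are hypothetical, every flow is assumed to connect to at least two distinct counters, and the small graph in the example is our own construction (a path, hence a tree) rather than the toy example of Figure 6.

```python
def decode(flow_edges, counters, T, f_min=1):
    """Message passing decoder following Exhibit 2.

    flow_edges[i] lists the distinct counters flow i hashes to (assumed >= 2).
    Returns the flow-size estimates after T iterations.
    """
    # mu[(i, a)]: message from flow i to counter a, initialised to 0 (line 3).
    mu = {(i, a): 0 for i, edges in enumerate(flow_edges) for a in edges}
    members = {}                      # counter a -> flows hashing into it
    for i, edges in enumerate(flow_edges):
        for a in edges:
            members.setdefault(a, []).append(i)

    nu = {}                           # nu[(a, i)]: message from counter a to flow i
    for t in range(1, T + 1):
        # Line 7: counter a subtracts the other flows' guesses from its value.
        for a, flows in members.items():
            total = sum(mu[(j, a)] for j in flows)
            for i in flows:
                nu[(a, i)] = max(counters[a] - (total - mu[(i, a)]), f_min)
        # Line 8: min over the other counters on odd t, max on even t.
        pick = min if t % 2 == 1 else max
        for i, edges in enumerate(flow_edges):
            for a in edges:
                mu[(i, a)] = pick(nu[(b, i)] for b in edges if b != a)

    # Line 10: final estimate over all incoming messages.
    pick = min if T % 2 == 1 else max
    return [pick(nu[(a, i)] for a in edges) for i, edges in enumerate(flow_edges)]


# Hypothetical example: three flows of sizes 5, 1, 3 on a path-shaped graph.
flow_edges = [[0, 1], [1, 2], [2, 3]]
counters = [5, 6, 4, 3]               # c_a = sum of the flows hashing into a

count_min = decode(flow_edges, counters, T=1)   # [5, 4, 3]
final = decode(flow_edges, counters, T=4)       # [5, 1, 3]
```

After one iteration the estimate for each flow is the minimum of its counters, which is the Count-Min estimate of Remark 1 and over-estimates the middle flow. Further iterations recover all three sizes exactly on this tree-shaped example.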
Toy example. Figure 6 shows the evolution of messages over 4 iterations on a toy example. In this particular example, all flow sizes are reconstructed correctly. Note that we are using different degrees at some flow nodes. In general, this gives potentially better performance than all flow nodes having the same degree, but we will stick to the latter in this paper for its ease of implementation.

The flow estimates at each iteration are listed in Table 1. All messages converge in 4 iterations and the estimates at Iteration 1 (second column) are the Count-Min estimates.

  iteration      0    1    2    3    4
  $\hat{f}_1$    0   34    1    1    1
  $\hat{f}_2$    0   34    1    1    1
  $\hat{f}_3$    0   32   32   32   32

Table 1: Flow estimates at each iteration. All messages converge after Iteration 3.

4.2 Multi-layer
Multi-layer Counter Braids are decoded recursively, one layer at a time. It is conceptually helpful to construct a new set of flows $f_l$ for layer-$l$ counters based on the counter values at layer $(l-1)$. The presence of status bits affects the definition of $f_l$.

[Figure 7: Without status bits, flows in $f_2$ have a one-to-one map to all counters in $c_1$.]

[Figure 8: With status bits, flows in $f_2$ have a one-to-one map to only counters that have overflown (whose status bits are set to 1).]

Figure 7 illustrates the construction of $f_2$ when there are no status bits. The vector $f_2$ has a one-to-one map to counters in layer 1, and a flow size in $f_2$ equals the number of times the corresponding counter has overflown, with the minimum value 0.

Figure 8 illustrates the construction of $f_2$ when there are status bits. The vector $f_2$ now has a one-to-one correspondence with only those counters in layer 1 that have overflown; that is, counters whose status bits are set to 1. The new flow size is still the number of times the corresponding counter overflows, but in this case, the minimum value is 1.

It is clear from the figure that the use of status bits effectively reduces the number of flow nodes in layer 2. Hence, fewer counters are needed in layer 2 for good decodability. This reduction in counter space at layer 2 trades off with the additional space needed for status bits themselves! As we shall see in Section 6, when the number of layers in CB is small, the tradeoff
favors the use of status bits.

The flow sizes are decoded recursively, starting from the topmost layer. For example, after decoding the layer-2 "flows," we add their sizes (the amount of overflow from layer-1 counters) to the values of layer-1 counters. We then use the new values of layer-1 counters to decode the flow sizes. Details of the algorithm are presented in Exhibit 3.

Exhibit 3: The Multi-layer Algorithm

  for $l = L$ to 1
    construct the graph for the $l$-th layer as in Figure 7 if without status bits; as in Figure 8 if with status bits;
    decode $f_l$ from $c_l$ as in Exhibit 2
    $c_{l-1} = c_{l-1} + f_l \times 2^{d_{l-1}}$, where $d_{l-1}$ is the counter depth in bits at layer $(l-1)$

5. SINGLE-LAYER ANALYSIS
The decoding algorithm works one layer at a time; hence, we first analyze the single-layer message passing algorithm and determine its rate $r$ and reconstruction error probability $P_{err}$. This analysis lays the foundation for the design of multi-layer Counter Braids, to be presented in Section 6.

Since all counters in layer 1 have the same depth $d_1$, a very relevant quantity for the analysis is the number of counters per flow:

$$\beta \equiv m/n,$$

where $m$ is the number of counters and $n$ is the number of flows. The compression rate in bits per flow is given by $r = \beta d_1$.

The bipartite graph in Figure 5 will be the focus of study, as its properties determine the performance of the algorithm.

Lemma 1. Toggling Property. If $\mu_{ia}(t-1) \le f_i$ for every $i$ and $a$, then $\nu_{ai}(t) \ge f_i$ and $\mu_{ia}(t) \ge f_i$. Conversely, if $\mu_{ia}(t-1) \ge f_i$ for every $i$ and $a$, then $\nu_{ai}(t) \le f_i$ and $\mu_{ia}(t) \le f_i$.

The proof of this lemma follows simply from the definition of $\nu$ and $\mu$ and is omitted.

Lemma 2. Anti-monotonicity Property. If $\mu$ and $\mu'$ are such that for every $i$ and $a$, $\mu_{ia}(t-1) \le \mu'_{ia}(t-1) \le f_i$, then $\mu_{ia}(t) \ge \mu'_{ia}(t) \ge f_i$. Consequently, since $\hat{f}(0) = 0$, $\hat{f}(2t) \le f$ component-wise and $\hat{f}(2t)$ is component-wise non-decreasing. Similarly $\hat{f}(2t+1) \ge f$ and is component-wise non-increasing.

Proof. It follows from line 7 of Exhibit 2 that, if $\mu_{ia}(t-1) \le \mu'_{ia}(t-1) \le f_i$, then $\nu_{ai}(t) \ge \nu'_{ai}(t) \ge f_i$.⁴ From this and the definitions of $\mu$ and $\hat{f}$ at lines 8 and 10 of Exhibit 2, the rest of the lemma follows. □
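Returning briefly to the recursion of Exhibit 3 in Section 4.2, the step that adds the decoded layer-2 "flows" back onto the layer-1 counters can be illustrated with a short worked example. This is a sketch under assumptions: the function name `restore_layer1` is hypothetical, status bits are assumed to be in use, and the decoded vector `f2` is assumed to be ordered by increasing index of the overflown layer-1 counters.

```python
def restore_layer1(counters1, status1, f2, d1):
    """Add decoded overflow counts back onto layer-1 counters.

    counters1: residual values of the d1-bit layer-1 counters
    status1:   status bits (1 if the counter overflowed at least once)
    f2:        decoded overflow counts, one per overflown counter, in index
               order (an assumption of this sketch)
    """
    overflown = [a for a, s in enumerate(status1) if s == 1]
    assert len(overflown) == len(f2)
    full = list(counters1)
    for a, n_over in zip(overflown, f2):
        full[a] += n_over << d1       # c_1[a] + f_2 * 2**d1
    return full


# Hypothetical values: 2-bit counters; counters 1 and 3 overflowed 2 and 1 times.
full = restore_layer1([3, 0, 1, 2], [0, 1, 0, 1], f2=[2, 1], d1=2)
# full == [3, 8, 1, 6]
```

The restored values are then treated as the counter vector for the single-layer decoder of Exhibit 2.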
The above lemmas give a powerful conclusion: The true value of the flow-size vector is sandwiched between monotonically increasing lower bounds and monotonically decreasing upper bounds. The question, therefore, is:

Convergence: When does the sandwich close? That is, under what conditions does the message passing algorithm converge?

We give two answers. The first is general, not requiring any knowledge of the flow-size distribution. The second uses the flow-size distribution, but gives a much better answer. Indeed, one obtains an exact threshold for the convergence of the algorithm: For $\beta > \beta^*$ the algorithm converges, and for $\beta < \beta^*$ it fails to converge (i.e. the sandwich does not close.)

5.1 Message Passing on Trees
Definition 1. A graph is a forest if for all nodes in the graph, there exists no path of non-vanishing length that starts and ends at the same node. In other words, the graph contains no loops. Such a graph is a tree if it is connected.

Fact 1. Consider a bipartite graph with $n$ flow nodes and $m = \beta n$ counter nodes, where each flow node connects to $k$ uniformly sampled counter nodes. It is a forest with high probability iff $\beta \ge k(k-1)$ [19].

Assume the bipartite graph is a forest. Since the flow nodes have degree $k > 1$, the leaves of the trees have to be counter nodes.

Theorem 1. For any flow node $i$ belonging to a tree component in the bipartite graph, the message passing algorithm converges to the correct flow estimates after a finite number of iterations. In other words, for every $a$, $\nu_{ai}(t)$, $\mu_{ia}(t)$ and $\hat{f}_i(t)$ all coincide with $f_i$ for all $t$ large enough.

[Figure 9: The tree $T_{a \to i}$ rooted at the directed edge $a \to i$. Its depth is $D_{a \to i} = 2$.]

Proof. For simplicity we prove convergence for $\nu_{ai}(t)$, as the convergence of other quantities easily follows. Given the directed edge $a \to i$, consider the subtree $T_{a \to i}$ rooted at $a \to i$ obtained by cutting all the counter nodes adjacent to $i$ but $a$, cf. Figure 9. Clearly $\nu_{ai}(t)$ only depends on the counter values inside $T_{a \to i}$, and we restrict our attention to this subtree. Let $D_{a \to i}$ denote the depth of $T_{a \to i}$. We shall prove by induction on $D_{a \to i}$ that $\nu_{ai}(t) = f_i$ for any $t \ge D_{a \to i}$.
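If $D_{a \to i} = 1$, this is trivially true: at any time $\nu_{ai}(t) = c_a$ and since $c_a = f_i$, the thesis follows. Assume now that the thesis holds for all depths up to $D$, and consider $D_{a \to i} = D + 1$. Let $j$ be one of the flows in $T_{a \to i}$ that hashes to counter $a$, and let $b$ denote one of the other counters to which it contributes, cf. Figure 9. Since the depth of the subtree $T_{b \to j}$ is at most $D$, by the induction hypothesis, $\nu_{bj}(t) = f_j$ for any $t \ge D$. Consider now $t \ge D + 1$. From the messages defined in Exhibit 2 and the previous observation, it follows that $\nu_{ai}(t) = c_a - \sum_{j \neq i} f_j = f_i$ as claimed. □

⁴ Note that we implicitly assume that $t$ is odd to be consistent with the definition of $\mu(\cdot)$ at line 8.

Unfortunately, the use of the above theorem for CB requires $\beta \ge k(k-1)$, which leads to an enormous wastage of counters. We will now assume knowledge of the flow-size distribution and dramatically reduce $\beta$. We will work with sparse random graphs that are not forests, but rather they will have a locally tree-like structure.

5.2 Sparse Random Graph
It turns out that we are able to characterize the reconstruction error probability at the $t$-th iteration of the algorithm more precisely. A nice observation enables us to use the idea of density evolution, developed in coding theory [21], to compute the error probability recursively in the large $n$ limit. Due to space limitation, we are unable to fully describe the ideas of this section. We will be content to state the main theorem and make some useful remarks.

Consider a bipartite graph with $n$ flow nodes and $m = \beta n$ counter nodes, where each flow node connects to $k$ uniformly sampled counter nodes. Let

$$\lambda_\gamma(x) = \sum_{i=1}^{\infty} e^{-\gamma}\,\frac{(\gamma x)^{i-1}}{(i-1)!},$$

where $\gamma = nk/m$ is the average degree of a counter node. The degree distribution of a counter node converges to a Poisson distribution as $n \to \infty$, and $\lambda_\gamma(x)$ is the generating function for the Poisson distribution. Assume that we are given the flow size distribution and let

$$\epsilon = P(f_i > \min).$$

Recall that $\min$ is the minimum value of flow sizes. Let

$$f(\gamma, x) = \epsilon\,\bigl\{1 - \lambda_\gamma\bigl(1 - [1 - \lambda_\gamma(1 - x)]^{k-1}\bigr)\bigr\}^{k-1},$$

and

$$\gamma^* \equiv \sup\{\gamma \in \mathbb{R} : x = f(\gamma, x) \text{ has no solution } \forall x \in (0, 1]\}.$$

Theorem 2. The Threshold. We have

$$\beta^* \equiv \frac{m}{n} = \frac{k}{\gamma^*}$$

such that in the large $n$ limit
(i) If $\beta > \beta^*$, $\hat{f}(2t) \uparrow f$ and $\hat{f}(2t+1) \downarrow f$.
(ii) If $\beta < \beta^*$, there exists a positive proportion of flows such that $\hat{f}_i(2t) < \hat{f}_i(2t+1)$ for all $t$. Thus, some flows are not correctly reconstructed.⁵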
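The threshold of Theorem 2 can be approximated numerically. The sketch below is one possible way to do so, not a procedure given in the paper: it uses the closed form $\lambda_\gamma(x) = e^{-\gamma(1-x)}$ obtained by summing the series above, checks the condition $f(\gamma, x) < x$ on a finite grid of $x$ values, and bisects on $\gamma$. The function names, the grid size, the search bound `hi` and the sample value $\epsilon = 0.25$ are all assumptions made for illustration; the bisection also assumes that $f(\gamma, x)$ is non-decreasing in $\gamma$ for fixed $x$.

```python
import math


def lam(gamma, x):
    """Poisson generating function: the series for lambda_gamma(x) sums to exp(-gamma*(1-x))."""
    return math.exp(-gamma * (1.0 - x))


def f_de(gamma, x, k, eps):
    """Density evolution map f(gamma, x) as defined before Theorem 2."""
    inner = 1.0 - lam(gamma, 1.0 - x)
    return eps * (1.0 - lam(gamma, 1.0 - inner ** (k - 1))) ** (k - 1)


def has_fixed_point(gamma, k, eps, grid=2000):
    """Approximate test: does y = f(gamma, x) reach y = x somewhere in (0, 1]?"""
    return any(f_de(gamma, j / grid, k, eps) >= j / grid for j in range(1, grid + 1))


def gamma_star(k, eps, hi=50.0, iters=50):
    """Bisection for the largest gamma with no fixed point (assumes monotonicity in gamma)."""
    lo = 0.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if has_fixed_point(mid, k, eps):
            hi = mid
        else:
            lo = mid
    return lo


# Illustrative value of eps (an assumption, not taken from the paper).
g2 = gamma_star(k=2, eps=0.25)
beta2 = 2 / g2            # corresponding counters-per-flow threshold k / gamma*
```

For $k = 2$ the map is concave in $x$ with slope $\epsilon\gamma^2$ at the origin, so with $\epsilon = 0.25$ the search returns a value close to 2, up to the resolution of the grid.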
Footnote 5: In the event of $\hat{f}_i(2t) < \hat{f}_i(2t+1)$, we know that an error has occurred. Moreover, $\hat{f}_i(2t)$ lower bounds and $\hat{f}_i(2t+1)$ upper bounds the true value $f_i$.

[Figure 10: Density evolution as a walk between two curves, $y = x$ and $y = f(\gamma, x)$.]

Remark 1. The characterization of the threshold $\beta^*$ largely depends on the locally tree-like structure of the sparse random graph. More precisely, it means that the graph contains no finite-length loops as $n \to \infty$. Based on this, the density evolution principle recursively computes the error probability after a finite number of iterations, during which all incoming messages at any node are independent. With some observations specific to this algorithm, we obtain $f(\gamma, x)$ as the recursion.

Remark 2. The definition of $\gamma^*$ can be understood visually using Figure 10. The recursive computation of error probability (see footnote 6) corresponds to a walk between the curve $y = f(\gamma, x)$ and the line $y = x$, where two iterations (even and odd) correspond to one step. If $\gamma < \gamma^*$, $y = f(\gamma, x)$ is below $y = x$, and the walk continues all the way to 0, cf. Figure 10. This means that the reconstruction error is 0. If $\gamma > \gamma^*$, $y = f(\gamma, x)$ intersects $y = x$ at points above 0, and the walk ends at a non-zero intersection point. This means that there is a positive error for any number of iterations.

Remark 3. The minimum value of $\beta^* = \sqrt{\epsilon}$ can be obtained after optimizing over all degree distributions, including irregular ones. For the specific bipartite graph in CB, where flow nodes have regular degree $k$ and counter nodes have Poisson distributed degrees, we obtain $\gamma^* = 1/\sqrt{\epsilon}$, $\beta^* = 2\sqrt{\epsilon}$, for $k = 2$. The values of $\gamma^*$ and $\beta^*$ for different $k$ are listed in Table 2 for $P(f_i > x) = x^{-1.5}$. The optimum value is $\sqrt{\epsilon} = 0.595$ in this case. The value $k = 3$ achieves the lowest $\beta^*$ among $2 \le k \le 7$, which is 18% more than the optimum.

k         2     3     4     5     6     7
gamma*    1.69  4.23  5.41  6.21  6.82  7.32
beta*     1.18  0.71  0.74  0.80  0.88  0.96

Table 2: Single-layer rate for $2 \le k \le 7$. $P(f_i > x) = x^{-1.5}$.

6. MULTI-LAYER DESIGN

Given a specific flow size distribution (or an upper bound on the empirical tail distribution), we have a general algorithm that optimizes the number of bits per flow in Counter
Braids over the following parameters: (1) number of layers, (2) number of hash functions in each layer, (3) depth of counters in each layer and (4) the use of status bits. We present below the results from the optimization.

Footnote 6: More precisely, it refers to the probability that an outgoing message is in error.

[Figure 11: Optimized space against number of layers. Horizontal axis: number of layers, $L$; vertical axis: space in bits per flow, $r$; curves for $\alpha = 1.5$, $1.1$ and $0.6$.]

(i) Two layers are usually sufficient. Figure 11 shows the decrease of total space (number of bits per flow) as the number of layers increases, for power-law distributions $P(f_i > x) = x^{-\alpha}$ with $\alpha = 1.5$, $1.1$ and $0.6$. For distributions with relatively light tails, such as $\alpha = 1.5$ or $1.1$, two layers accomplish the major part of the space reduction, whereas for heavier tails, such as $\alpha = 0.6$, three layers help reduce space further. Note that the distribution with $\alpha = 0.6$ has very heavy tails. For instance, the flow distributions from real Internet traces, such as those plotted in [16], have $\alpha \approx 2$. Hence two layers suffice for most network traffic.

(ii) 3 hash functions is optimal for two-layer CB. We optimized total space over the number of hash functions in each layer for a two-layer CB. Using 3 hash functions in both layers achieves the minimum space. Fixing $k = 3$ and using the traffic distribution, we can find $\beta^*$ according to Theorem 2. The number of counters in layer 1 is $m_1 = \beta^* n$, where $n$ is the number of flows.

(iii) Layer-1 counter depth and number of layer-2 counters. There is a tradeoff between the depth of layer-1 counters and the number of layer-2 counters, since shallow layer-1 counters overflow more often. For most network traffic with $\alpha \ge 1.1$, 4 or 5 bits in layer 1 suffice. For distributions with heavier tails, such as $\alpha = 1$, the optimal depth is 7 to 8 bits. Since layer-2 counters are much deeper than layer-1 counters, it is usually favorable to have at least one order of magnitude fewer counters in layer 2.

(iv) Status bits are helpful. We consider a two-layer CB and compare the optimized rate with and without status bits. Sizings that achieve the minimum rate with $\alpha = 1.5$ and maximum flow size 13 are summarized below. Here $r$ denotes the total number of bits per flow.
$\beta_i$ denotes the number of counters per flow in the $i$-th layer. $d_1$ denotes the number of bits in the first layer (in the two-layer case, $d_2$ = maximum flow size $- d_1$). $k_i$ denotes the number of hash functions in the $i$-th layer. CB with status bits achieves smaller total space, $r$. Similar results are observed with other values of $\alpha$ and maximum flow size.

                 r     beta_1  beta_2  d_1  k_1  k_2
status bit       4.13  0.71    0.065   4    3    3
no status bit    4.66  0.71    0.14    5    3    3

We summarize the above as the following rules of thumb.
1. Use a two-layer CB with status bits and 3 hash functions at each layer.
2. Empirically estimate (or guess based on historical data) the heavy-tail exponent $\alpha$ and the max flow size.
3. Compute $\beta^*$ according to Theorem 2. Set $m_1 = \beta^* n$ and $m_2 = 0.1 \beta^* n$.
4. Use 5-bit counters at layer 1 for $\alpha \ge 1.1$, and 8-bit counters for $\alpha < 1.1$. Use deep enough counters at layer 2 so that the largest flow is accommodated (in general, 64-bit counters at layer 2 are deep enough).

7. EVALUATION

We evaluate the performance of Counter Braids using both randomly generated traces and real Internet traces. In Section 7.1 we generate a random graph and a random set of flow sizes for each run of the experiment. We use $n = 1000$ and average the reconstruction error, $P_{err}$, and the average error magnitude, $E_m$, over enough rounds that their standard deviation is less than 1/10 of their magnitude. In Section 7.2 we use 5-minute segments of two one-hour contiguous Internet traces and generate a random graph for each segment. We report results for the entire duration of two hours.

The reconstruction error $P_{err}$ is the total number of errors divided by the total number of flows, and the average error magnitude $E_m$ measures how big the deviation from the actual flow size is, provided an error has occurred.

7.1 Performance

First, we compare the performance of one-layer and two-layer CB. We use 1000 flows randomly generated from the distribution $P(f_i > x) = x^{-1.5}$, whose entropy is a little less than 3 bits. We vary the total number of bits per flow in CB and compute $P_{err}$ and $E_m$. In all experiments, we use CB with 3 hash functions. For the two-layer CB, we use 4-bit deep layer-1 counters with status bits. The results are shown in Figure 12.

The points labelled 1-layer threshold and 2-layer threshold are the respective asymptotic thresholds computed using density evolution. We observe that with 1000 flows, there is a sharp decrease in $P_{err}$ around this
asymptotic threshold. Indeed, the error is less than 1 in 1000 when the number of bits per flow is 1 bit above the asymptotic threshold. With a larger number of flows, the decrease around the threshold is expected to be even sharper.

Similarly, once above the threshold, the average error magnitude for both 1-layer and 2-layer Counter Braids is close to 1, the minimum magnitude of an error. When below the threshold, the average error magnitude increases only linearly as the number of bits decreases. At 1 bit per flow, we have 40-50% of flows incorrectly decoded, but the average error magnitude is only about 5. This means that many flow estimates are not far from the true values.

Together, we see that the 2-layer CB has much better performance than the 1-layer CB with the same space. As we increase the number of layers, the asymptotic threshold will move closer to entropy. However, we observe that the 2-layer CB has already accomplished most of the gain.

[Figure 12: Performance over a varying number of bits per flow. Two panels, reconstruction error $P_{err}$ (log scale) and average error magnitude $E_m$, each against bits per flow; curves for one layer and two layers, with the entropy, the 1-layer threshold and the 2-layer threshold marked.]

[Figure 13: Performance over number of iterations. Proportion of flows estimated incorrectly against number of iterations; curves for below threshold, at threshold, above threshold and Count-Min. Note that $P_{err}$ for a Count-Min sketch with the same space as CB is high.]

Next, we investigate the number of iterations required to reconstruct the flows. Figure 13 shows the remaining proportion of incorrectly decoded flows as the number of iterations increases. The experiments are run for 1000 flows with the same distribution as above, on a one-layer Counter Braids. The number of bits per flow is chosen to be below, at and above the asymptotic threshold. As predicted by density evolution, $P_{err}$ decreases exponentially and converges to 0 at or above the asymptotic threshold, and converges to a positive error when below threshold. In this experiment, 10 iterations are sufficient to recover most of the flows.
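A small-scale version of this one-layer experiment can be sketched as follows. Exhibit 2 is not reproduced in this section, so the update rules below are an assumed reading of the decoder: counter-to-flow messages subtract the other flows' current estimates from the counter value (floored at the minimum flow size), and flow-to-counter messages take the minimum over the other counters at odd iterations and the maximum at even iterations. All function names, the sizes `n`, `m`, `k` and the use of `random.paretovariate` to draw flow sizes are illustrative assumptions.

```python
import random


def build_graph(n, m, k, rng):
    """Each flow picks k distinct counters uniformly at random."""
    return [rng.sample(range(m), k) for _ in range(n)]


def count(flows, graph, m):
    """Every packet of flow i increments all k of its counters."""
    c = [0] * m
    for size, nbrs in zip(flows, graph):
        for a in nbrs:
            c[a] += size
    return c


def decode(c, graph, m, T, fmin=1):
    """Message passing for T >= 1 iterations; returns flow estimates."""
    n = len(graph)
    members = [[] for _ in range(m)]  # flows hashing to each counter
    for i, nbrs in enumerate(graph):
        for a in nbrs:
            members[a].append(i)
    nu = {(i, a): 0 for i in range(n) for a in graph[i]}  # flow -> counter
    mu = {}                                               # counter -> flow
    for t in range(1, T + 1):
        for a in range(m):
            total = sum(nu[(j, a)] for j in members[a])
            for i in members[a]:
                # counter value minus the other flows' current estimates
                mu[(a, i)] = max(c[a] - (total - nu[(i, a)]), fmin)
        pick = min if t % 2 == 1 else max
        for i in range(n):
            for a in graph[i]:
                nu[(i, a)] = pick(mu[(b, i)] for b in graph[i] if b != a)
    pick = min if T % 2 == 1 else max
    return [pick(mu[(a, i)] for a in graph[i]) for i in range(n)]


rng = random.Random(0)
n, m, k = 200, 400, 3
# Integer flow sizes with P(f >= x) = x^(-1.5) for integer x >= 1.
flows = [int(rng.paretovariate(1.5)) for _ in range(n)]
graph = build_graph(n, m, k, rng)
c = count(flows, graph, m)
upper = decode(c, graph, m, T=9)    # odd T: upper bounds on flow sizes
lower = decode(c, graph, m, T=10)   # even T: lower bounds on flow sizes
errors = sum(u != f for u, f in zip(upper, flows))
print(f"flows not yet recovered after 9 iterations: {errors} of {n}")
```

Under these update rules the odd-iteration estimates never fall below the true sizes and the even-iteration estimates never exceed them, so a flow whose two bounds coincide is decoded exactly.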
7.2 Trace Simulation

We use two OC-48 (2.5 Gbps) one-hour contiguous traces at a San Jose router. Trace 1 was collected on Wednesday, Jan 15, 2003, 10am to 11am, hence representative of weekday traffic. Trace 2 was collected on Thursday, Apr 24, 2003, 12am to 1am, hence representative of night-time traffic. We divide each trace into twelve 5-minute segments, each corresponding to a measurement epoch. Figure 14 plots the tail distribution ($P(f_i > x)$) for all segments. Although the line rate is not high, the number of active flows is already significant. Each segment in trace 1 has approximately 0.9 million flows and 20 million packets, and each segment in trace 2 has approximately 0.7 million flows and 9 million packets. The statistics across different segments within one trace are similar.

[Figure 14: Tail distribution. $P(f_i > x)$ against flow size in packets, on log-log axes; legend: trace 1, 12 segments.]

Trace 1 has heavier traffic than trace 2 and also a heavier tail. In fact, it is the heaviest trace we have encountered so far, and is much heavier than, for instance, the traces plotted in [16]. The proportion of one-packet flows in trace 1 is only 0.11, similar to that of a power-law distribution with $\alpha = 0.17$. Flows with size larger than 10 packets are distributed similarly to a power law with $\alpha \approx 1$.

We fix the same sizing of CB for all segments, mimicking the realistic scenario where traffic varies over time and CB is built in hardware. We present the proportion of flows in error $P_{err}$ and the average error magnitude $E_m$ for both traces together. We vary the total number of bits in CB (see footnote 7), denoted by $B$, and present the result in Table 3.

For all experiments, we use a two-layer CB with status bits, and 3 hash functions at both layers. The layer-1 counters are 8 bits deep and the layer-2 counters are 56 bits deep.

B (MB)    1.2   1.3   1.35  1.4
P_err     0.33  0.25  0.15  0
E_m       3     1.9   1.2   0

Table 3: Simulation results of counting 2 traces in 5-minute segments, on a fixed-size CB with total space $B$.

We observe a similar phenomenon as in Figure 12. As we underprovide space, the reconstruction error increases significantly. However, the error magnitude remains small. For these two traces, 1.4 MB is sufficient to count all flows correctly in 5-minute segments.
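The space figures quoted in Section 6 and in Table 3 can be related by simple accounting. The sketch below rests on several assumptions that are not stated in the text: that each layer-1 counter costs $d_1$ bits plus one optional status bit, that the "maximum flow size 13" of the Section 6 sizing table is a width in bits, that the status bit adds one bit per layer-1 counter in the rules of thumb, and that 1 MB means $10^6$ bytes. The function names are hypothetical.

```python
def bits_per_flow(beta1, beta2, d1, d2, status_bit):
    """Total space r in bits per flow for a two-layer CB.

    Assumed accounting: a layer-1 counter costs d1 bits plus one
    optional status bit; a layer-2 counter costs d2 bits.
    """
    return beta1 * (d1 + (1 if status_bit else 0)) + beta2 * d2


def rule_of_thumb_sizing(n, beta_star, alpha, d2=64):
    """Counter counts and depths following rules of thumb 1 to 4."""
    m1 = round(beta_star * n)           # layer-1 counters
    m2 = round(0.1 * beta_star * n)     # layer-2 counters
    d1 = 5 if alpha >= 1.1 else 8       # layer-1 depth in bits
    total_bits = m1 * (d1 + 1) + m2 * d2
    return {"m1": m1, "m2": m2, "d1": d1, "d2": d2,
            "total_bits": total_bits}


# The two rows of the Section 6 sizing table, with d2 = 13 - d1.
r_status = bits_per_flow(0.71, 0.065, d1=4, d2=13 - 4, status_bit=True)
r_plain = bits_per_flow(0.71, 0.14, d1=5, d2=13 - 5, status_bit=False)
print(round(r_status, 3), round(r_plain, 3))

# Space per flow implied by B = 1.4 MB for about 0.9 million flows.
trace_bits_per_flow = 1.4e6 * 8 / 0.9e6
print(round(trace_bits_per_flow, 2))

print(rule_of_thumb_sizing(n=1_000_000, beta_star=0.71, alpha=1.5))
```

Under this accounting the two table rows come out within rounding of the reported rates $r = 4.13$ and $r = 4.66$, which is some support for the assumed formula, though it remains an inference.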
Footnote 7: We are not using bits per flow here since the number of flows is different in different segments.

8. IMPLEMENTATION

8.1 On-Chip Updates

Each layer of CB can be built on a separate block of SRAM to enable pipelining. On pre-built memories, the counter depth is chosen to be an integer fraction of the word length, so as to maximize space usage. This constraint does not exist with custom-made memories.

We need a list of flow labels to construct the first-layer graph for reconstruction. In cases where access frequencies for prefixes or filters are being collected, the flow nodes are simply the set of prefixes or filter criteria, which are the same across all measurement epochs. Hence no flow labels need to be collected or transferred.

In other cases where the flow labels are elements of a large space (e.g. flow 5-tuples), the labels need to be collected and transferred to the decoding unit. The method for collecting flow labels is application-specific, and may depend on the particular implementation of the application. We give the following suggestion for collecting flow 5-tuples in a specific scenario.

For TCP flows, a flow label can be written to a DRAM which maintains flow IDs when a flow is established; for example, when a "SYN" packet arrives. Since flows are established much less frequently than packets arrive (approximately one in 40 packets causes a flow to be set up [10]), these memory accesses do not create a bottleneck. Flows that span boundaries of measurement epochs can be identified using a Bloom Filter [3].

Finally, we evaluated the algorithm by measuring flow sizes in packets. The algorithm can also be used to measure flow sizes in bytes. Since most byte-counting is really the counting of byte-chunks (e.g. 32 or 64 byte-chunks), there is the question of choosing the "right granularity": a small value gives accurate counts but uses more space, and vice versa. We are working on a nice approach to this problem and will report results in future publications.

8.2 Computation Cost of Decoder

We reconstruct the flow sizes using the iterative message passing algorithm in an offline unit. The decoding complexity is linear in the number of flows. Decoding CB with more than one layer imposes only a small additional cost, since the higher layers are 1-2 orders smaller than the
first layer. For example, decoding 1 million flows on a two-layer Counter Braids takes, on average, 15 seconds on a 2.6 GHz machine.

9. CONCLUSION AND FURTHER WORK

We presented Counter Braids, an efficient minimum-space counter architecture that solves large-scale network measurement problems such as per-flow and per-prefix counting. Counter Braids incrementally compresses the flow sizes as it counts, and the message passing reconstruction algorithm recovers flow sizes almost perfectly. We minimize counter space with incremental compression, and solve the flow-to-counter association problem using random graphs. As shown from real trace simulations, we are able to count up to 1 million flows purely in SRAM and recover the exact flow sizes. We are currently implementing this in an FPGA to determine the actual memory usage and to better understand implementation issues.

Several directions are open for further exploration. We mention two: (i) Since a flow passes through multiple routers, and since our algorithm is amenable to a distributed implementation, it will save counter space dramatically to combine the counts collected at different routers. (ii) Since our algorithm "degrades gracefully," in the sense that if the amount of space is less than the required amount, we can still recover many flows accurately and have errors of known size on a few, it is worth studying the graceful degradation formally as a "lossy compression" problem.

Acknowledgement: Support for OC-48 data collection is provided by DARPA, NSF, DHS, Cisco and CAIDA members. This work has been supported in part by NSF Grant Number 0653876, for which we are thankful. We also thank the Clean Slate Program at Stanford University, and the Stanford Graduate Fellowship program for supporting part of this work.

10. REFERENCES

[1] http://www.cisco.com/warp/public/732/Tech/netflow.
[2] Juniper networks solutions for network accounting. www.juniper.net/techcenter/appnote/350003.html.
[3] B. Bloom. Space/time trade-offs in hash coding with allowable errors. Comm. ACM, 13, July 1970.
[4] J. W. Byers, M. Luby, M. Mitzenmacher, and A. Rege. A digital fountain approach to reliable distribution of bulk data. In SIGCOMM, pages 56-67, 1998.
[5] G. Caire, S. Shamai, and S. Verdu. Noiseless data compression with low density parity check codes. In DIMACS, New York, 2004.
[6] E. Candes and T. Tao. Near optimal signal recovery from random projections and universal encoding strategies. IEEE Trans. Inform. Theory, 2004.
[7] G. Cormode and S. Muthukrishnan. An improved data stream summary: the count-min sketch and its applications. Journal of Algorithms, 55(1), April 2005.
[8] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley, New York, 1991.
[9] M. Crovella and A. Bestavros. Self-similarity in world wide web traffic: Evidence and possible causes. IEEE/ACM Trans. Networking, 1997.
[10] S. Dharmapurikar and V. Paxson. Robust TCP stream reassembly in the presence of adversaries. 14th USENIX Security Symposium, 2005.
[11] D. Donoho. Compressed sensing. IEEE Trans. Inform. Theory, 52(4), April 2006.
[12] C. Estan and G. Varghese. New directions in traffic measurement and accounting. Proc. ACM SIGCOMM Internet Measurement Workshop, pages 75-80, 2001.
[13] R. G. Gallager. Low-Density Parity-Check Codes. MIT Press, Cambridge, Massachusetts.
[14] M. Grossglauser and J. Rexford. Passive traffic measurement for IP operations. The Internet as a Large-Scale Complex System, 2002.
[15] F. Kschischang, B. Frey, and H.-A. Loeliger. Factor graphs and the sum-product algorithm. IEEE Trans. Inform. Theory, 47:498-519, 2001.
[16] A. Kumar, M. Sung, J. J. Xu, and J. Wang. Data streaming algorithms for efficient and accurate estimation of flow size distribution. Proceedings of ACM SIGMETRICS, 2004.
[17] Y. Lu, A. Montanari, and B. Prabhakar. Detailed network measurements using sparse graph counters: The theory. Allerton Conference, September 2007.
[18] M. Luby, M. Mitzenmacher, A. Shokrollahi, D. A. Spielman, and V. Stemann. Practical loss-resilient codes. In Proc. of STOC, pages 150-159, 1997.
[19] M. Mezard and A. Montanari. Constraint satisfaction networks in Physics and Computation. In Preparation.
[20] S. Ramabhadran and G. Varghese. Efficient implementation of a statistics counter architecture. Proc. ACM SIGMETRICS, pages 261-271, 2003.
[21] T. Richardson and R. Urbanke. Modern Coding Theory. Cambridge University Press, 2007.
[22] D. Shah, S. Iyer, B. Prabhakar, and N. McKeown. Analysis of a statistics counter architecture. Proc. IEEE HotI 9.
[23] Q. G. Zhao, J. J. Xu, and Z. Liu. Design of a novel statistics counter architecture with optimal space and time efficiency. SIGMetrics/Performance, June 2006.

Appendix: Asymptotic Optimality

We state the result on asymptotic optimality without a proof. The complete proof can be found in [17].

We make two assumptions on the flow size distribution $p$:
1. It has at most power-law tails. By this we mean that $P\{f_i \ge x\} \le A x^{-\epsilon}$ for some constant $A$ and some $\epsilon > 0$. This is a valid assumption for network statistics [9].
2. It has decreasing digit entropy. Write $f_i$ in its $q$-ary expansion $\sum_{a \ge 0} f_i(a) q^a$. Let $h_l = -\sum_x P(f_i(l) = x) \log_q P(f_i(l) = x)$ be the $q$-ary entropy of $f_i(l)$. Then $h_l$ is monotonically decreasing in $l$ for any $q$ large enough.

We call a distribution $p$ with these two properties admissible. This class includes most cases of practical interest. For instance, any power-law distribution is admissible. The (binary) entropy of this distribution is denoted by $H_2(p) \equiv -\sum_x p(x) \log_2 p(x)$.

For this section only, we assume that all counters in CB have an equal depth of $d$ bits. Let $q = 2^d$.

Definition 2. We represent CB as a sparse graph $G$, with vertices consisting of $n$ flows and a total of $m(n)$ counters in all layers. A sequence of Counter Braids $\{G_n\}$ has design rate $r$ if

$$r = \lim_{n \to \infty} \frac{m(n)}{n} \log_2 q. \quad (1)$$

It is reliable for the distribution $p$ if there exists a sequence of reconstruction functions $\hat{F}_n \equiv \hat{F}_{G_n}$ such that

$$P_{err}(G_n, \hat{F}_n) \equiv P\{\hat{F}_n(c) \ne f\} \xrightarrow{n \to \infty} 0. \quad (2)$$

Here is the main theorem:

Theorem 3. For any admissible input distribution $p$, and any rate $r > H_2(p)$, there exists a sequence of reliable sparse Counter Braids with asymptotic rate $r$.

The theorem is satisfying as it shows that the CB architecture is fundamentally good in the information-theoretic sense. Despite being incremental and linear, it is as good as, for example, Huffman codes, at infinite blocklength.