Counter Braids: A Novel Counter Architecture for Per-Flow Measurement

Yi Lu, Department of EE, Stanford University, yi.lu@stanford.edu
Andrea Montanari, Departments of EE and Stats, Stanford University, montanar@stanford.edu
Balaji Prabhakar, Departments of EE and CS, Stanford University, balaji@stanford.edu
Sarang Dharmapurikar, Nuova Systems, Inc, San Jose, California, sarang@nuovasystems.com
Abdul Kabbani, Department of EE, Stanford University, akabbani@stanford.edu

ABSTRACT

Fine-grained network measurement requires routers and switches to update large arrays of counters at very high link speed (e.g. 40 Gbps). A naive algorithm needs an infeasible amount of SRAM to store both the counters and a flow-to-counter association rule, so that arriving packets can update corresponding counters at link speed. This has made accurate per-flow measurement complex and expensive, and motivated approximate methods that detect and measure only the large flows.

This paper revisits the problem of accurate per-flow measurement. We present a counter architecture, called Counter Braids, inspired by sparse random graph codes. In a nutshell, Counter Braids "compresses while counting". It solves the central problems (counter space and flow-to-counter association) of per-flow measurement by "braiding" a hierarchy of counters with random graphs. Braiding results in drastic space reduction by sharing counters among flows; and using random graphs generated on-the-fly with hash functions avoids the storage of flow-to-counter association.

The Counter Braids architecture is optimal (albeit with a complex decoder) as it achieves the maximum compression rate asymptotically. For implementation, we present a low-complexity message passing decoding algorithm, which can recover flow sizes with essentially zero error. Evaluation on Internet traces demonstrates that almost all flow sizes are recovered exactly with only a few bits of counter space per flow.

Categories and Subject Descriptors
C.2.3 [Computer Communication Networks]: Network Operations - Network Monitoring; E.1 [Data Structures]

General Terms
Measurement, Algorithms, Theory, Performance

Keywords
Statistics Counters, Network Measurement, Message Passing Algorithms

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGMETRICS'08, June 2-6, 2008, Annapolis, Maryland, USA. Copyright 2008 ACM 978-1-60558-005-0/08/06 ...$5.00.

1. INTRODUCTION

There is an increasing need for fine-grained network measurement to aid the management of large networks [14]. Network measurement consists of counting the size of a logical entity called a "flow", at an interface such as a router. A flow is a sequence of packets that satisfy a common set of rules. For instance, packets with the same source (destination) address constitute a flow. Measuring flows of this type gives the volume of upload (download) by a user and is useful for accounting and billing purposes. Measuring flows with a specific flow 5-tuple in the packet header gives more detailed information such as routing distribution and types of traffic in the network. Such information can help greatly with traffic engineering and bandwidth provisioning. Flows can also be defined by packet classification. For example, ICMP Echo packets used for network attacks form a flow. Measuring such flows is useful during and after an attack for anomaly detection and network forensics.

Currently there exists no large-scale statistics counter architecture that is both cheap and accurate. This is mainly due to the lack of affordable high-density high-bandwidth memory devices. To illustrate the problem, the processing time for a 64-byte packet at a 40-Gbps OC-768 link is 12 ns. This requires memories with access time much smaller than that of commercially available DRAM (whose access time is tens of nsec), and makes it necessary to employ SRAMs. However, due to their low density, large SRAMs are expensive and difficult to implement on-chip. It is, therefore, essential to find a counter architecture that minimizes memory space.

There are two main components of the total space requirement:

1. Counter space. Assuming that a million distinct flows are observed in an interval^1 and using one 64-bit counter
per flow (a standard vendor practice [20]), 8 MB of SRAM is needed for counter space alone.

2. Flow-to-counter association rule. The set of active flows varies over time, and the flow-to-counter association rule needs to be dynamically constructed. For a small number of flows, a content-addressable memory (CAM) is used in most applications. However, the high power consumption and heat dissipation of CAMs forbid their use in realistic scenarios, and SRAM hash tables are used to store the flow-to-counter association rule. This requires at least another 10 MB of SRAM.

The large space requirement not only considerably increases the cost of line cards, but also hinders a compact layout of chips due to the low density of SRAM.

^1 Our OC-48 (2.5 Gbps) trace data show that there are about 900,000 distinct flow 5-tuples in a 5-minute interval. On 40-Gbps links, there can easily be an excess of a million distinct flow 5-tuples in a short observation interval. Or, for measuring the frequency of prefix accesses, one needs about 500,000 counters, which is the current size of IPv4 routing tables [20]. Future routers may easily support more than a million prefixes.

1.1 Previous Approaches

The wide applicability and inherent difficulty of designing statistics counters have attracted the attention of the research community. There are two main approaches: (i) exact counting using a hybrid SRAM-DRAM architecture, and (ii) approximate counting by exploiting the heavy-tail nature of the flow size distribution. We review these approaches below.

Exact counting. Shah et al. [22] proposed and analyzed a hybrid architecture, taking the first step towards an implementable large-scale counter array. The architecture consists of shallow counters in fast SRAM and deep counters in slow DRAM. The challenge is to find a simple algorithm for updating the DRAM counters so that no SRAM counter overflows in between two DRAM updates. The algorithm analyzed in [22] was subsequently improved by Ramabhadran and Varghese [20] and Zhao et al. [23]. This reduced the algorithm complexity, making it feasible to use a small SRAM with 5 bits per flow to count flow sizes in packets (not bytes). However, all the papers above suffer from the following drawbacks: (i) deep (typically 64 bits per flow) off-chip DRAM counters are needed, (ii) costly SRAM-to-DRAM updates are required, and (iii) the flow-to-counter association problem is assumed to be solved using a CAM or a hash table. In particular, they do not address the flow-to-counter association problem.

Approximate counting. To keep cost acceptable, practical solutions from industry and academic research either sacrifice accuracy or limit the scope of measurement. For example, Cisco's NetFlow [1] counts both 5-tuples and per-prefix flows based on sampling, which introduces a significant 9% relative error even for large flows and more errors for smaller flows [12]. Juniper Networks introduced filter-based accounting [2] to count a limited set of flows pre-defined manually by operators. The "sample-and-hold" solution proposed by Estan and Varghese in [12], while achieving high accuracy, measures only flows that occupy more than 0.1% of the total bandwidth. Estan and Varghese's approach introduced the idea of exploiting the heavy-tail flow size distribution: since a few large flows bring most of the data, it is feasible to quickly identify these large flows and measure their sizes only.

1.2 Our Approach

The main contribution of this paper is an SRAM-only large-scale counter architecture with the following features:

1. Flow-to-counter association using a small number (e.g. 3) of hash functions.

2. Incremental compression of flow sizes as packets arrive; only a small number (e.g. 3) of counters are accessed at each packet arrival.

3. Asymptotic optimality. We have proved in [17] that Counter Braids (CB), with an optimal (but NP-hard) decoder, has an asymptotic compression rate matching the information theoretic limit. The result is surprising since CB forms a restrictive family of compressors.

4. A linear-complexity message passing decoding algorithm that recovers all flow sizes from compressed counts with essentially zero error. Total space in CB needed for exact recovery is close to the optimal compression of flow sizes.

5. The message passing algorithm is analyzable, enabling the choice of design parameters for different hardware requirements.

Remark: We note that CB has the disadvantage of not supporting instantaneous queries of flow sizes. All flow sizes are decoded together at the end of a measurement epoch. We plan to address this problem in future work.

Informal description. Counter Braids is a hierarchy of counters braided via random graphs in tandem. Figure 1(a) shows a naive counter architecture that stores five flow sizes in counters of equal depth, which has to exceed the size of the largest flow. Each bit in a counter is shown as a circle. The least significant bit (LSB) is the one closest to the flow node. Filled circles represent a 1, and unfilled circles a 0. This structure leads to an enormous wastage of space because the majority of flows are small.

Figure 1(b) shows CB for storing the same flow sizes. It is worth noting that: (i) CB has fewer "more significant bits" and they are shared among all flows, and (ii) the exact flow sizes can be obtained by "decoding" the bit pattern stored in CB. A comparison of the two figures clearly shows a great reduction in space.

1.3 Related Theoretical Literature

Compressed Sensing. The idea of Counter Braids is thematically related to compressed sensing [6, 11], whose central innovation is summarized by the following quote:

Since we can "throw away" most of our data and still be able to reconstruct the original with no perceptual loss (as we do with ubiquitous sound, image and data compression formats,) why can't we directly measure the part that will not end up being "thrown away"? [11]

For the network measurement problem, we obtain a vector of counter values, c, via CB, from the flow sizes f. If f has a small entropy, the vector c occupies much less space than f; it constitutes "the part (of f) that will not end up being thrown away." An off-chip decoding algorithm then recovers f from c. While Compressed Sensing and CB are
thematically related, they are methodologically quite different: Compressed Sensing computes random linear transformations of the data and uses LP (linear programming) reconstruction methods; whereas CB uses a multi-layered non-linear structure and a message passing reconstruction algorithm.

Figure 1: (a) A simple counter structure. (b) Counter Braids. (filled circle = 1, unfilled circle = 0).

Sparse random graph codes. Counter Braids is methodologically inspired by the theory of low-density parity check (LDPC) codes [13, 21]. See also the related literature on Tornado codes [18] and Fountain codes [4]. From the information theoretic perspective, the design of an efficient counting scheme and a good flow size estimation is equivalent to the design of an efficient compressor, or a source code [8]. However, the network measurement problem imposes a stringent constraint on such a code: each time the size of a flow changes (because a new packet arrives), a small number of operations must be sufficient to update the compressed information. This is not the case with standard source codes (such as the Lempel-Ziv algorithm), where changing a single bit in the source stream may completely alter the compressed version. We find that the class of source codes dual to LDPC codes [5] work well under this constraint; using features of these codes makes CB a good "incremental compressor."

There is a problem in using the design of LDPC codes for network measurement: with the heavy-tailed distribution, the flow sizes are a priori unbounded. In the channel coding language, this is equivalent to using a countable but infinite input alphabet. As a result, new ideas are developed for proving the achievability of the optimal asymptotic compression rate. The full proof is contained in [17] and we state the theorem in the appendix for completeness.

The large alphabet size also makes iterative message passing decoding algorithms [15], such as Belief Propagation, highly complex to implement, as BP passes probabilities rather than numbers. In this paper, we present a novel message passing decoding algorithm of low complexity that is easy to implement. The sub-optimality of the message passing algorithm naturally requires more counter space than the information theoretic limit. We characterize the minimum space required for zero asymptotic decoding error using "density evolution" [21]. The space requirement can be further optimized with respect to the number of layers in Counter Braids, and the degree distribution of each layer. The optimized space is close to the information theoretic limit, enabling CB to fit into small SRAM.

Count-Min Sketch. Like Counter Braids, the Count-Min sketch [7] for data stream applications is also a random hash-based structure. With Count-Min, each flow hashes to and updates d counters; the minimum value of the d counters is retrieved as the flow estimate. The Count-Min sketch provides probabilistic guarantees for the estimation error: with at least $1 - \delta$ probability, the estimation error is less than $\epsilon |f|_1$, where $|f|_1$ is the sum of all flow sizes. To have small $\epsilon$ and $\delta$, the number of counters needs to be large.

The Count-Min sketch is different from Counter Braids in the following ways: (a) There is no "braiding" of counters, hence no compression. (b) The estimation algorithm for the Count-Min sketch is one-step, whereas it is iterative for CB. In fact, comparing the Count-Min algorithm to our reconstruction algorithm on a one-layer CB, it is easy to see that the estimate by Count-Min is exactly the estimate after the first iteration of our algorithm. Thus, CB performs at least as well as the Count-Min algorithm.^2 (c) Our reconstruction algorithm detects errors. That is, it can distinguish the flows whose sizes are incorrectly estimated, and produce an upper and lower bound of the true value; whereas the Count-Min sketch only guarantees an over-estimate. (d) CB needs to decode all the flow sizes at once, unlike the Count-Min algorithm which can estimate a single flow size. Thus, Count-Min is better at handling online queries than CB.

^2 This is similar to the benefit of Turbo codes over conventional soft-decision decoding algorithms and illustrates the power of the "Turbo principle."

Structurally related to Counter Braids (random hashing of flows into counters and a recovery algorithm) is the work of Kumar et al. [16]. The goal of that work is to estimate the flow size distribution and not the actual flow sizes, which is our aim.

In Section 2, we define the goals of this paper and outline our solution methodology. Section 3 describes the Counter Braids architecture. The message passing decoding algorithm is described in Section 4 and analyzed in Section 5. Section 6 explores the choice of parameters for multi-layer CB. The algorithm is evaluated using traces in Section 7. We discuss implementation issues in Section 8 and outline further work in Section 9.

2. PROBLEM FORMULATION

We divide time into measurement epochs (e.g. 5 minutes). The objective is to count the number of packets per flow for all active flows within a measurement epoch. We do not deal with the byte-counting problem in this paper due to space limitation, but there is no constraint in using Counter Braids for byte-counting.

Goals: As mentioned in Section 1, the main problems we wish to address are: (i) compacting (or eliminating) the space used by the flow-to-counter association rule, and (ii) saving counter space and incrementally compressing the counts. Additionally, we would like (iii) a low-complexity algorithm to reconstruct flow sizes at the end of a measurement epoch.

Solution methodology: Corresponding to the goals, we (i) use a small number of hash functions, (ii) braid the counters, and (iii) use a linear-complexity message-passing algorithm to reconstruct flow sizes. In particular, by using a small number of hash functions, we eliminate the need for storing a flow-to-counter association rule.

Performance measures:

(1) Space: measured in number of bits per flow occupied by counters. We denote it by r (to suggest compression rate as in the information theory literature.) Note that the number of counters is not the correct measure of compression rate; rather, it is the number of bits.

(2) Reconstruction error: measured as the fraction of flows whose reconstructed value is different from the true value:

$$P_{err} \equiv \frac{1}{n} \sum_{i=1}^{n} I(\hat{f}_i \neq f_i),$$

where n is the total number of flows, $\hat{f}_i$ is the estimated size of flow i and $f_i$ the true size. I is the indicator function, which returns 1 if the expression in the bracket is true and 0 otherwise. We chose this metric since we want exact reconstruction.

(3) Average error magnitude: defined as the ratio of the sum of absolute errors and the number of errors:

$$E_m = \frac{\sum_i |f_i - \hat{f}_i|}{\sum_i I(f_i \neq \hat{f}_i)}.$$

It measures how big an error is when an error has occurred.

The statement of asymptotic optimality in the appendix yields that it is possible to keep space equal to the flow-size entropy, and have reconstruction error going to 0 as the number of flows goes to infinity.

Both analysis (Section 5) and simulations (Section 7) show that with our low-complexity message passing decoding algorithm, we can keep space close to the flow-size entropy and obtain essentially zero reconstruction error. In addition, the algorithm offers a graceful degradation of error when space is further reduced, even below the flow-size entropy. Although reconstruction error becomes significant, average error magnitude remains small, which means that most flow-size estimates are close to their true values.

3. OUR SOLUTION

The overall architecture of our solution is shown in Figure 2. Each arriving packet updates Counter Braids in on-chip SRAM. This constitutes the encoding stage if we view measurement as compression. At the end of a measurement epoch, the content of Counter Braids, i.e., the compressed counts, is transferred to an offline processing unit, such as a PC. A reconstruction algorithm then recovers the list of $\langle$flow ID, size$\rangle$ pairs.

We describe CB in Section 3.1 and specify the mapping that solves the flow-to-counter association problem in Section 3.2. We describe the updating scheme, or the on-chip encoding algorithm, in Section 3.3, leaving the description of the reconstruction algorithm to Section 4.

Figure 2: System Diagram.

3.1 Counter Braids

Counter Braids has a layered structure. The l-th layer has $m_l$ counters with a depth of $d_l$ bits. Let the total number of layers be L. In practice, L = 2 is usually sufficient as will be shown in Section 6. Figure 3 illustrates the case where L = 2. For a complete description of the structure, we leave L as a parameter.
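To make the space measure r concrete for this layered structure, the following is a minimal sketch that sums $m_l d_l$ over the layers and divides by the number of flows. The class name `Layer`, the function `bits_per_flow`, and all the sizing numbers are hypothetical and chosen only for illustration; they are not taken from the paper's evaluation.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    num_counters: int  # m_l in the text
    depth_bits: int    # d_l in the text


def bits_per_flow(layers, num_flows):
    """Space r: total counter bits across all layers, divided by the number of flows."""
    total_bits = sum(layer.num_counters * layer.depth_bits for layer in layers)
    return total_bits / num_flows


# Hypothetical sizing for one million flows (illustrative numbers only).
n = 1_000_000
naive = [Layer(num_counters=n, depth_bits=64)]  # one 64-bit counter per flow
braid = [Layer(num_counters=700_000, depth_bits=5),
         Layer(num_counters=50_000, depth_bits=59)]

r_naive = bits_per_flow(naive, n)
r_braid = bits_per_flow(braid, n)
```

Under these assumed sizes the two-layer structure uses roughly a tenth of the bits of the naive array, which is the kind of comparison the measure r is meant to support.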
Figure 3: Two-layer Counter Braids with two hash functions and status bits.

We will show in later sections that we can use a decreasing number of counters in each layer of CB, and still be able to recover the flow sizes correctly. The idea is that given a heavy-tail distribution for flow sizes, the more significant bits in the counters are poorly utilized; since braiding allows more significant bits to be shared among all flows, a reduced number of counters in the higher layers suffice.

Figure 3 also shows an optional feature of CB, the status bits. A status bit is an additional bit on a first-layer counter. It is set to 1 after the corresponding counter first overflows. Counter Braids without status bits is theoretically sufficient: the asymptotic optimality result in the appendix is shown without status bits, assuming a high-complexity optimal decoder. However, in practice we use a low-complexity message passing decoder, and the particular shape of the network traffic distribution is better exploited with status bits. Status bits occupy additional space, but provide useful information to the message-passing decoder so that the number of second-layer counters can be further reduced, yielding a favorable tradeoff in space. Status bits are taken into account when computing the total space; in particular, they figure in the performance measure, r, "space in number of bits per flow." In CB with more than two layers, every layer except the last will have counters with status bits.

3.2 The Random (Hash) Mappings

We use the same random mapping in two settings: (i) between flows and the first-layer counters, and (ii) between two consecutive layers of counters. The dashed arrows in Figure 3 illustrate both (i) and (ii) (which is between the first and second layer counters.)

Consider the random mapping between flows and the layer-1 counters. For each flow ID, we apply k pseudo-random hash functions with a common range $\{0, \ldots, m_1 - 1\}$, where $m_1$ is the number of counters in layer 1, as illustrated in Figure 3 (with k = 2.) The mapping has the following features:

1. It is dynamically constructed for a varying set of active flows, by applying hash functions to flow IDs. In other words, no memory space is needed to describe the mapping explicitly. The storage for the flow-to-counter association is simply the size of the description of the k hash functions and does not increase with the number of flows n.

2. The number of hash functions k is set to a small constant (e.g. 3). This allows counters to be updated with only a small number of operations at a packet arrival.

Remark. Note that the mapping does not have any special structure. In particular, it is not bijective. This necessitates the use of a reconstruction algorithm to recover the flow sizes. Using $k > 1$ adds redundancy to the mapping and makes recovery possible. However, the random mapping does more than simplify the flow-to-counter association. In fact, it performs the compression of flow sizes into counter values and reduces counter space.

Next consider the random mapping between two consecutive layers of counters. For each counter location (in the range $\{0, \ldots, m_l - 1\}$) in the l-th layer, we apply k hash functions to obtain the corresponding (l+1)-th layer counter locations (in the range $\{0, \ldots, m_{l+1} - 1\}$). It is illustrated in Figure 3 with k = 2. The use of hash functions enables us to implement the mapping without extra circuits in the hardware; and the random mapping further compresses the counts in layer-2 counters.

3.3 Encoding: The Updating Algorithm

The initialization and update procedures of a two-layer Counter Braids with 2 hash functions at each layer are specified in Exhibit 1. The procedures include both the generation of the random mapping using hash functions and the updating scheme. When a packet arrives, both counters its flow label hashes into are incremented. And when a counter in layer 1 overflows, both counters in layer 2 it hashes into are incremented by 1, like a carry-over. The overflowing counter is reset to 0 and the corresponding status bit is set to 1.

It is evident from the exhibit that the amount of updating required is very small. Yet after each update, the counters store a compressed version of the most up-to-date flow sizes. The incremental nature of this compression algorithm is made possible with the use of random sparse linear codes, which we shall further exploit at the reconstruction stage.

Exhibit 1: The Update Algorithm

Initialize
  for layer l = 1 to 2
    for counter i = 1 to m_l
      counters[l][i] = 0

Update
  Upon the arrival of a packet pkt
    idx1 = hash-function1(pkt);
    idx2 = hash-function2(pkt);
    counters[1][idx1] = counters[1][idx1] + 1;
    counters[1][idx2] = counters[1][idx2] + 1;
    if counters[1][idx1] overflows,
      Update second-layer counters (idx1);
    if counters[1][idx2] overflows,
      Update second-layer counters (idx2)

Update second-layer counters (idx)
  statusbit[1][idx] = 1;
  idx3 = hash-function3(idx);
  idx4 = hash-function4(idx);
  counters[2][idx3] = counters[2][idx3] + 1;
  counters[2][idx4] = counters[2][idx4] + 1

The update of the second-layer counters can be pipelined. It can be executed together with the next update of the first-layer counters. In general, pipelining can be used for CB with multiple layers.
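The update procedure of Exhibit 1 can be sketched in software as follows. This is only an illustrative model, not the hardware implementation: the helper `h` (a seeded BLAKE2b digest reduced modulo the range) is an assumed stand-in for the unspecified hash functions, the class name `TwoLayerBraid` is hypothetical, and layer-2 counters are modeled as unbounded Python integers.

```python
import hashlib


def h(seed, key, m):
    """Hypothetical hash function: maps key into {0, ..., m-1} (stand-in for hash-function1..4)."""
    digest = hashlib.blake2b(f"{seed}:{key}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % m


class TwoLayerBraid:
    def __init__(self, m1, d1, m2, k=2):
        self.m1, self.d1, self.m2, self.k = m1, d1, m2, k
        self.layer1 = [0] * m1   # d1-bit counters
        self.status = [0] * m1   # status bit per layer-1 counter
        self.layer2 = [0] * m2   # deep counters (unbounded here)

    def _update_second_layer(self, idx):
        # A layer-1 overflow behaves like a carry into k layer-2 counters.
        self.status[idx] = 1
        for s in range(self.k):
            self.layer2[h(("L2", s), idx, self.m2)] += 1

    def update(self, flow_id):
        # Each arriving packet increments the k layer-1 counters of its flow.
        for s in range(self.k):
            idx = h(("L1", s), flow_id, self.m1)
            self.layer1[idx] += 1
            if self.layer1[idx] == 1 << self.d1:  # counter overflows
                self.layer1[idx] = 0              # reset to 0
                self._update_second_layer(idx)


braid = TwoLayerBraid(m1=8, d1=2, m2=4, k=2)
for packet in range(100):
    braid.update(f"flow-{packet % 5}")
```

Whatever the hash values are, the braid conserves counts: the layer-1 contents plus $2^{d_1}$ times the number of overflows equals k times the number of packets, which is a convenient sanity check for any implementation.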
Figure 4: A toy example for updating. Numbers next to flow nodes are current flow sizes. Dotted lines indicate hash functions. Thick lines indicate hash functions being computed by an arriving packet. The flow with an arriving packet is indicated by an arrow.

Figure 4 illustrates the updating algorithm with a toy example. (a) shows the initial state of CB with two flows. In (b), a new flow arrives, bringing the first packet; a layer-1 counter overflows and updates two layer-2 counters. In (c), a packet of an existing flow arrives and no overflow occurs. In (d), another packet of an existing flow arrives and another layer-1 counter overflows.

4. MESSAGE PASSING DECODER

The sparsity of the random graphs^3 in CB opens the way to using low-complexity message passing algorithms for reconstruction of flow sizes, but the design of such an algorithm is not obvious. In the case of LDPC codes, message passing decoding algorithms hold the promise of approaching capacity with unprecedentedly low complexity. However, the algorithms used in coding, such as Belief Propagation, have increasing memory requirements as the alphabet size grows, since BP passes probability distributions instead of single numbers. We develop a novel message passing algorithm that is simple to implement on countable alphabets.

4.1 One Layer

Consider the random mapping between flows and the first-layer counters. It is a bipartite graph with flow nodes on the left and counter nodes on the right, as shown in Figure 5. An edge connects flow i and counter a if one of the k hash functions maps flow i to counter a. The vector f denotes flow sizes and c denotes counter values:

$$c_a = \sum_{i \in \partial a} f_i,$$

where $\partial a$ denotes all the flows that hash into counter a. The problem is to estimate f given c.
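The relation between f and c can be illustrated with a short sketch. It assumes, for simplicity, that each flow is attached to k distinct counters drawn with `random.sample`; real hash functions may collide, so this is an idealization, and the function names `build_graph` and `counter_values` are hypothetical.

```python
import random


def build_graph(n, m, k, seed=0):
    """graph[i] lists the k counters of flow i (assumed distinct in this sketch)."""
    rng = random.Random(seed)
    return [rng.sample(range(m), k) for _ in range(n)]


def counter_values(flow_sizes, graph, m):
    """c_a is the sum of f_i over all flows i that hash into counter a."""
    c = [0] * m
    for i, counters in enumerate(graph):
        for a in counters:
            c[a] += flow_sizes[i]
    return c


# A hand-built path-shaped graph: flow 0 -> {0, 1}, flow 1 -> {1, 2}, flow 2 -> {2, 3}.
toy_graph = [[0, 1], [1, 2], [2, 3]]
toy_f = [1, 1, 32]
toy_c = counter_values(toy_f, toy_graph, m=4)
```

Because every flow contributes its size to exactly k counters, the counter values always sum to k times the total traffic, regardless of the graph.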
Figure 5: Message passing on a bipartite graph with flow nodes (circles) and counter nodes (rectangles.)

Message passing algorithms are iterative. In the t-th iteration messages are passed from all flow nodes to all counter nodes and then back in the reverse direction. A message goes from flow i to counter a (denoted by $\nu_{ia}$) and vice versa (denoted by $\mu_{ai}$) only if nodes i and a are neighbors (connected by an edge) on the bipartite graph.

Our algorithm is described in Exhibit 2. The messages $\nu_{ia}(0)$ are initialized to 0, although any initial value less than the minimum flow size, min, will work just as well.

The interpretation of the messages is as follows: $\mu_{ai}$ conveys counter a's guess of flow i's size based on the information it received from neighboring flows other than flow i. Conversely, $\nu_{ia}$ is the guess by flow i of its own size, based on the information it received from neighboring counters other than counter a.

^3 Each random mapping in CB is a random bipartite graph with edges generated by the k hash functions. It is sparse because the number of edges is linear in the number of nodes, as opposed to quadratic for a complete bipartite graph.

Exhibit 2: The Message Passing Decoding Algorithm

1: Initialize
2:   min = minimum flow size;
3:   $\nu_{ia}(0) = 0$ $\forall i$ and $\forall a$;
4:   $c_a$ = a-th counter value
5: Iterations
6:   for iteration number t = 1 to T
7:     $\mu_{ai}(t) = \max\{ c_a - \sum_{j \neq i} \nu_{ja}(t-1),\ \min \}$;
8:     $\nu_{ia}(t) = \min_{b \neq a} \mu_{bi}(t)$ if t is odd, $\max_{b \neq a} \mu_{bi}(t)$ if t is even.
9: Final Estimate
10:  $\hat{f}_i(T) = \min_a \{\mu_{ai}(T)\}$ if T is odd, $\max_a \{\mu_{ai}(T)\}$ if T is even.

Remark 1. Since $\nu_{ia}(0) = 0$, $\mu_{ai}(1) = c_a$ and

$$\hat{f}_i(1) = \min_a \{c_a\},$$

which is precisely the estimate of the Count-Min algorithm. Thus, the estimate of Count-Min is the estimate of our message-passing algorithm after the first iteration.

Remark 2. The distinction between odd and even iterations at lines 8 and 10 is due to the "anti-monotonicity property" of the message-passing algorithm, to be discussed in Section 5.

Remark 3. It turns out that the algorithm remains unchanged if the minimum or maximum at line 8 is over all incoming messages, that is,

$$\nu_{ia}(t) = \min_b \mu_{bi}(t) \text{ if } t \text{ is odd}, \qquad \max_b \mu_{bi}(t) \text{ if } t \text{ is even}.$$

The change will save some computations in implementation. The proof of this fact and ensuing analytical consequences is deferred to forthcoming publications. In this paper, we stick to the algorithm in Exhibit 2.

Figure 6: The decoding algorithm over 4 iterations. Numbers in the topmost figure are true flow sizes and counter values. In an iteration, numbers next to a node are messages on its outgoing edges, from top to bottom. Each iteration involves messages going from flows to counters and back from counters to flows.

Toy example. Figure 6 shows the evolution of messages over 4 iterations on a toy example. In this particular example, all flow sizes are reconstructed correctly. Note that we are using different degrees at some flow nodes. In general, this gives potentially better performance than all flow nodes having the same degree, but we will stick to the latter in this paper for its ease of implementation.

The flow estimates at each iteration are listed in Table 1. All messages converge in 4 iterations and the estimate at Iteration 1 (second column) is the Count-Min estimate.

iteration      0    1    2    3    4
$\hat{f}_1$    0    34   1    1    1
$\hat{f}_2$    0    34   1    1    1
$\hat{f}_3$    0    32   32   32   32

Table 1: Flow estimates at each iteration. All messages converge after Iteration 3.

4.2 Multi-layer

Multi-layer Counter Braids are decoded recursively, one layer at a time. It is conceptually helpful to construct a new set of flows $f^{(l)}$ for layer-l counters based on the counter values at layer (l-1). The presence of status bits affects the definition of $f^{(l)}$.

Figure 7: Without status bits, flows in $f^{(2)}$ have a one-to-one map to all counters in $c^{(1)}$.
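Since every layer is decoded with the single-layer procedure of Exhibit 2, a compact software sketch of that procedure may be useful here. It is written for clarity rather than speed, and it assumes that each flow is connected to at least two distinct counters and that T is at least 1; the function name `decode` and the dictionary-based message storage are choices of this sketch, not part of the paper.

```python
def decode(graph, c, f_min, T):
    """Single-layer message passing (Exhibit 2).

    graph[i]: list of distinct counters of flow i (at least two per flow).
    c[a]: value of counter a.  f_min: minimum flow size.  T >= 1 iterations.
    """
    # nu[(i, a)]: message from flow i to counter a, initialized to 0 (line 3).
    nu = {(i, a): 0 for i, counters in enumerate(graph) for a in counters}
    mu = {}
    for t in range(1, T + 1):
        # Sum of incoming flow messages at each counter.
        incoming = {}
        for (j, a), value in nu.items():
            incoming[a] = incoming.get(a, 0) + value
        # Line 7: counter a's guess for flow i, excluding flow i's own message.
        mu = {(a, i): max(c[a] - (incoming[a] - nu[(i, a)]), f_min)
              for (i, a) in nu}
        # Line 8: min over the other counters on odd t, max on even t.
        pick = min if t % 2 == 1 else max
        nu = {(i, a): pick(mu[(b, i)] for b in graph[i] if b != a)
              for (i, a) in nu}
    # Line 10: final estimate uses all incoming counter messages.
    pick = min if T % 2 == 1 else max
    return [pick(mu[(a, i)] for a in counters) for i, counters in enumerate(graph)]


# Hypothetical path-shaped graph with true sizes [1, 1, 32] and counters [1, 2, 33, 32].
example_graph = [[0, 1], [1, 2], [2, 3]]
example_c = [1, 2, 33, 32]
count_min_estimate = decode(example_graph, example_c, f_min=1, T=1)
exact_estimate = decode(example_graph, example_c, f_min=1, T=2)
```

On this small example, which is not the example of Figure 6, the first iteration reproduces the Count-Min over-estimate for the middle flow and the second iteration already corrects it, in line with Remark 1.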
Figure 8: With status bits, flows in $f^{(2)}$ have a one-to-one map to only counters that have overflown (whose status bits are set to 1).

Figure 7 illustrates the construction of $f^{(2)}$ when there are no status bits. The vector $f^{(2)}$ has a one-to-one map to counters in layer 1, and a flow size in $f^{(2)}$ equals the number of times the corresponding counter has overflown, with the minimum value 0.

Figure 8 illustrates the construction of $f^{(2)}$ when there are status bits. The vector $f^{(2)}$ now has a one-to-one correspondence with only those counters in layer 1 that have overflown; that is, counters whose status bits are set to 1. The new flow size is still the number of times the corresponding counter overflows, but in this case, the minimum value is 1.

It is clear from the figure that the use of status bits effectively reduces the number of flow nodes in layer 2. Hence, fewer counters are needed in layer 2 for good decodability. This reduction in counter space at layer 2 trades off with the additional space needed for the status bits themselves. As we shall see in Section 6, when the number of layers in CB is small, the tradeoff favors the use of status bits.

The flow sizes are decoded recursively, starting from the topmost layer. For example, after decoding the layer-2 "flows," we add their sizes (the amount of overflow from layer-1 counters) to the values of layer-1 counters. We then use the new values of layer-1 counters to decode the flow sizes. Details of the algorithm are presented in Exhibit 3.
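The step of adding decoded overflow counts back to the layer-1 counters can be sketched as follows for the case with status bits. The function name `restore_layer1` and the convention that the decoded layer-2 "flows" are listed in increasing order of layer-1 counter index are assumptions of this sketch.

```python
def restore_layer1(c1, status, f2, d1):
    """Add decoded overflow counts back to layer-1 counters.

    c1: stored layer-1 counter values (each below 2**d1).
    status: status bit of each layer-1 counter.
    f2: decoded overflow counts, one per counter whose status bit is 1,
        in increasing order of counter index (assumed convention).
    d1: depth of layer-1 counters in bits.
    """
    overflow_counts = iter(f2)
    restored = []
    for value, bit in zip(c1, status):
        if bit:
            # Each overflow represents 2**d1 increments that were carried away.
            value += next(overflow_counts) << d1
        restored.append(value)
    return restored


# Hypothetical example: 2-bit counters; counters 1 and 2 overflowed twice and once.
full_c1 = restore_layer1(c1=[3, 1, 0], status=[0, 1, 1], f2=[2, 1], d1=2)
```

The restored values play the role of the counter vector c in the single-layer decoder when the original flow sizes are recovered.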
Exhibit 3: The Multi-layer Algorithm

for l = L to 1
  construct the graph for the l-th layer as in Figure 7 if without status bits; as in Figure 8 if with status bits;
  decode $f^{(l)}$ from $c^{(l)}$ as in Exhibit 2
  $c^{(l-1)} = c^{(l-1)} + f^{(l)} \times 2^{d_{l-1}}$, where $d_{l-1}$ is the counter depth in bits at layer (l-1)

5. SINGLE-LAYER ANALYSIS

The decoding algorithm works one layer at a time; hence, we first analyze the single-layer message passing algorithm and determine its rate r and reconstruction error probability $P_{err}$. This analysis lays the foundation for the design of multi-layer Counter Braids, to be presented in Section 6.

Since all counters in layer 1 have the same depth $d_1$, a very relevant quantity for the analysis is the number of counters per flow:

$$\beta = m/n,$$

where m is the number of counters and n is the number of flows. The compression rate in bits per flow is given by $r = \beta d_1$.

The bipartite graph in Figure 5 will be the focus of study, as its properties determine the performance of the algorithm.

Lemma 1. Toggling Property. If $\nu_{ia}(t-1) \leq f_i$ for every i and a, then $\mu_{ai}(t) \geq f_i$ and $\nu_{ia}(t) \geq f_i$. Conversely, if $\nu_{ia}(t-1) \geq f_i$ for every i and a, then $\mu_{ai}(t) \leq f_i$ and $\nu_{ia}(t) \leq f_i$.

The proof of this lemma follows simply from the definition of $\nu$ and $\mu$ and is omitted.

Lemma 2. Anti-monotonicity Property. If $\nu$ and $\nu'$ are such that for every i and a, $\nu_{ia}(t-1) \leq \nu'_{ia}(t-1) \leq f_i$, then $\nu_{ia}(t) \geq \nu'_{ia}(t) \geq f_i$. Consequently, since $\hat{f}(0) = 0$, $\hat{f}(2t) \leq f$ component-wise and $\hat{f}(2t)$ is component-wise non-decreasing. Similarly $\hat{f}(2t+1) \geq f$ and is component-wise non-increasing.

Proof. It follows from line 7 of Exhibit 2 that, if $\nu_{ia}(t-1) \leq \nu'_{ia}(t-1) \leq f_i$, then $\mu_{ai}(t) \geq \mu'_{ai}(t) \geq f_i$.^4 From this and the definitions of $\nu$ and $\hat{f}$ at lines 8 and 10 of Exhibit 2, the rest of the lemma follows.
The above lemmas give a powerful conclusion: The true value of the flow-size vector is sandwiched between monotonically increasing lower bounds and monotonically decreasing upper bounds. The question, therefore, is:

Convergence: When does the sandwich close? That is, under what conditions does the message passing algorithm converge?

We give two answers. The first is general, not requiring any knowledge of the flow-size distribution. The second uses the flow-size distribution, but gives a much better answer. Indeed, one obtains an exact threshold for the convergence of the algorithm: For $\beta > \beta^*$ the algorithm converges, and for $\beta < \beta^*$ it fails to converge (i.e. the sandwich does not close.)

5.1 Message Passing on Trees

Definition 1. A graph is a forest if for all nodes in the graph, there exists no path of non-vanishing length that starts and ends at the same node. In other words, the graph contains no loops. Such a graph is a tree if it is connected.

Fact 1. Consider a bipartite graph with n flow nodes and $m = \beta n$ counter nodes, where each flow node connects to k uniformly sampled counter nodes. It is a forest with high probability iff $\beta \geq k(k-1)$ [19].

Assume the bipartite graph is a forest. Since the flow nodes have degree $k > 1$, the leaves of the trees have to be counter nodes.

Theorem 1. For any flow node i belonging to a tree component in the bipartite graph, the message passing algorithm converges to the correct flow estimates after a finite number of iterations. In other words, for every a, $\mu_{ai}(t)$, $\nu_{ia}(t)$ and $\hat{f}_i(t)$ all coincide with $f_i$ for all t large enough.

Figure 9: The tree $T_{a \to i}$ rooted at the directed edge $a \to i$. Its depth is $D_{a \to i} = 2$.

Proof. For simplicity we prove convergence for $\mu_{ai}(t)$, as the convergence of the other quantities easily follows. Given the directed edge $a \to i$, consider the subtree $T_{a \to i}$ rooted at $a \to i$ obtained by cutting all the counter nodes adjacent to i but a, cf. Figure 9. Clearly $\mu_{ai}(t)$ only depends on the counter values inside $T_{a \to i}$, and we restrict our attention to this subtree.

Let $D_{a \to i}$ denote the depth of $T_{a \to i}$. We shall prove by induction on $D_{a \to i}$ that $\mu_{ai}(t) = f_i$ for any $t \geq D_{a \to i}$.
If $D_{a \to i} = 1$, this is trivially true: at any time $\mu_{ai}(t) = c_a$ and since $c_a = f_i$, the thesis follows.

Assume now that the thesis holds for all depths up to D, and consider $D_{a \to i} = D + 1$. Let j be one of the flows in $T_{a \to i}$ that hashes to counter a, and let b denote one of the other counters to which it contributes, cf. Figure 9. Since the depth of the subtree $T_{b \to j}$ is at most D, by the induction hypothesis, $\mu_{bj}(t) = f_j$ for any $t \geq D$. Consider now $t \geq D + 1$. From the messages defined in Exhibit 2 and the previous observation, it follows that $\mu_{ai}(t) = c_a - \sum_{j \neq i} f_j = f_i$ as claimed.

^4 Note that we implicitly assume that t is odd to be consistent with the definition of $\nu(\cdot)$ at line 8.

Unfortunately, the use of the above theorem for CB requires $\beta \geq k(k-1)$, which leads to an enormous wastage of counters. We will now assume knowledge of the flow-size distribution and dramatically reduce $\beta$. We will work with sparse random graphs that are not forests, but rather they will have a locally tree-like structure.

5.2 Sparse Random Graph

It turns out that we are able to characterize the reconstruction error probability at the t-th iteration of the algorithm more precisely. A nice observation enables us to use the idea of density evolution, developed in coding theory [21], to compute the error probability recursively in the large n limit. Due to space limitation, we are unable to fully describe the ideas of this section. We will be content to state the main theorem and make some useful remarks.

Consider a bipartite graph with n flow nodes and $m = \beta n$ counter nodes, where each flow node connects to k uniformly sampled counter nodes. Let

$$\lambda_\gamma(x) = \sum_{i=1}^{\infty} \frac{e^{-\gamma} (\gamma x)^{i-1}}{(i-1)!},$$

where $\gamma = nk/m$ is the average degree of a counter node. The degree distribution of a counter node converges to a Poisson distribution as $n \to \infty$, and $\lambda_\gamma(x)$ is the generating function for the Poisson distribution. Assume that we are given the flow size distribution and let

$$\epsilon = P(f_i > \min).$$

Recall that min is the minimum value of flow sizes. Let

$$f(\gamma, x) = \epsilon \left\{ 1 - \lambda_\gamma\left( 1 - [1 - \lambda_\gamma(1 - x)]^{k-1} \right) \right\}^{k-1},$$

and

$$\gamma^* \equiv \sup \{ \gamma \in \mathbb{R} : x = f(\gamma, x) \text{ has no solution } \forall x \in (0, 1] \}.$$

Theorem 2. The Threshold. We have

$$\beta^* \equiv \frac{m}{n} = \frac{k}{\gamma^*}$$

such that in the large n limit

(i) If $\beta > \beta^*$, $\hat{f}(2t) \uparrow f$ and $\hat{f}(2t+1) \downarrow f$.

(ii) If $\beta < \beta^*$, there exists a positive proportion of flows such that $\hat{f}_i(2t) < \hat{f}_i(2t+1)$ for all t. Thus, some flows are not correctly reconstructed.^5
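The threshold in Theorem 2 can be evaluated numerically. The sketch below uses the closed form $\lambda_\gamma(x) = e^{-\gamma(1-x)}$ of the series above and finds $\gamma^*$ by bisection, testing on a grid of x values whether the curve $y = f(\gamma, x)$ reaches the line $y = x$. The grid resolution, the bisection bounds and the function names are choices of this sketch, so the result is only accurate to roughly two decimal places.

```python
import math


def f(gamma, x, k, eps):
    """Density-evolution recursion f(gamma, x) with lambda_gamma(z) = exp(-gamma * (1 - z))."""
    def lam(z):
        return math.exp(-gamma * (1.0 - z))
    inner = (1.0 - lam(1.0 - x)) ** (k - 1)
    return eps * (1.0 - lam(1.0 - inner)) ** (k - 1)


def gamma_star(k, eps, grid=2000, lo=0.01, hi=50.0, iterations=50):
    """Largest gamma for which x = f(gamma, x) has no solution in (0, 1] (grid approximation)."""
    def touches_diagonal(gamma):
        return any(f(gamma, i / grid, k, eps) >= i / grid for i in range(1, grid + 1))

    # f is increasing in gamma, so the predicate is monotone and bisection applies.
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if touches_diagonal(mid):
            hi = mid
        else:
            lo = mid
    return lo


# eps = 2 ** -1.5 (about 0.354), i.e. sqrt(eps) of about 0.595.
eps = 2 ** -1.5
thresholds = {k: gamma_star(k, eps) for k in (2, 3)}
beta_star = {k: k / g for k, g in thresholds.items()}
```

For k = 2 the computed value should be close to $1/\sqrt{\epsilon}$, and for k = 3 it should give $\beta^*$ near 0.71, i.e. fewer counters per flow than k = 2.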
⁵ In the event of $\hat{f}_i(2t) < \hat{f}_i(2t+1)$, we know that an error has occurred. Moreover, $\hat{f}_i(2t)$ lower bounds and $\hat{f}_i(2t+1)$ upper bounds the true value $f_i$.

[Figure 10: Density evolution as a walk between two curves, $y = x$ and $y = f(\gamma, x)$.]

Remark 1. The characterization of the threshold largely depends on the locally tree-like structure of the sparse random graph. More precisely, it means that the graph contains no finite-length loops as $n \to \infty$. Based on this, the density evolution principle recursively computes the error probability after a finite number of iterations, during which all incoming messages at any node are independent. With some observations specific to this algorithm, we obtain $f(\gamma, x)$ as the recursion.

Remark 2. The definition of $\gamma^*$ can be understood visually using Figure 10. The recursive computation of error probability⁶ corresponds to a walk between the curve $y = f(\gamma, x)$ and the line $y = x$, where two iterations (even and odd) correspond to one step. If $\gamma < \gamma^*$, $y = f(\gamma, x)$ is below $y = x$, and the walk continues all the way to 0, cf. Figure 10. This means that the reconstruction error is 0. If $\gamma > \gamma^*$, $y = f(\gamma, x)$ intersects $y = x$ at points above 0, and the walk ends at a non-zero intersection point. This means that there is a positive error for any number of iterations.

Remark 3. The minimum value of $\beta^* = \sqrt{\epsilon}$ can be obtained after optimizing over all degree distributions, including irregular ones. For the specific bipartite graph in CB, where flow nodes have regular degree $k$ and counter nodes have Poisson distributed degrees, we obtain $\gamma^* = 1/\sqrt{\epsilon}$, $\beta^* = 2\sqrt{\epsilon}$, for $k = 2$. The values of $\gamma^*$ and $\beta^*$ for different $k$ are listed in Table 2 for $P(f_i \ge x) = x^{-1.5}$. The optimum value $\sqrt{\epsilon} = 0.595$ in this case. The value $k = 3$ achieves the lowest $\beta^*$ among $2 \le k \le 7$, which is 18% more than the optimum.

k            2     3     4     5     6     7
$\gamma^*$   1.69  4.23  5.41  6.21  6.82  7.32
$\beta^*$    1.18  0.71  0.74  0.80  0.88  0.96

Table 2: Single-layer rate for $2 \le k \le 7$. $P(f_i \ge x) = x^{-1.5}$.

⁶ More precisely, it refers to the probability that an outgoing message is in error.

6. MULTI-LAYER DESIGN

Given a specific flow size distribution (or an upper bound on the empirical tail distribution), we have a general algorithm that optimizes the number of bits per flow in Counter Braids over the following parameters: (1) number of layers, (2) number of hash functions in each layer, (3) depth of counters in each layer and (4) the use of status bits. We present below the results from the optimization.

[Figure 11: Optimized space against number of layers. Axes: number of layers, L, against space in bits per flow, r; one curve each for $\alpha = 1.5$, $1.1$ and $0.6$.]

(i) Two layers are usually sufficient. Figure 11 shows the decrease of total space (number of bits per flow) as the number of layers increases, for power-law distributions $P(f_i \ge x) = x^{-\alpha}$ with $\alpha = 1.5$, $1.1$ and $0.6$. For distributions with relatively light tails, such as $\alpha = 1.5$ or $1.1$, two layers accomplish the major part of space reduction; whereas for heavier tails, such as $\alpha = 0.6$, three layers help reduce space further. Note that the distribution with $\alpha = 0.6$ has very heavy tails. For instance, the flow distributions from real Internet traces, such as those plotted in [16], have $\alpha \approx 2$. Hence two layers suffice for most network traffic.

(ii) 3 hash functions is optimal for two-layer CB. We optimized total space over the number of hash functions in each layer for a two-layer CB. Using 3 hash functions in both layers achieves the minimum space. Fixing $k = 3$ and using the traffic distribution, we can find $\beta^*$ according to Theorem 2. The number of counters in layer 1 is $m_1 = \beta^* n$, where $n$ is the number of flows.

(iii) Layer-1 counter depth and number of layer-2 counters. There is a tradeoff between the depth of layer-1 counters and the number of layer-2 counters, since shallow layer-1 counters overflow more often. For most network traffic with $\alpha \ge 1.1$, 4 or 5 bits in layer 1 suffice. For distributions with heavier tails, such as $\alpha = 1$, the optimal depth is 7 to 8 bits. Since layer-2 counters are much deeper than layer-1 counters, it is usually favorable to have at least one order fewer counters in layer 2.

(iv) Status bits are helpful. We consider a two-layer CB and compare the optimized rate with and without status bits. Sizings that achieve the minimum rate with $\alpha = 1.5$ and maximum flow size 13 are summarized below. Here $r$ denotes the total number of bits per flow. $\beta_i$ denotes the number of counters per flow in the $i$-th layer. $d_1$ denotes the number of bits in the first layer (in the two-layer case, $d_2$ = maximum flow size $- d_1$). $k_i$ denotes the number of hash functions in the $i$-th layer. CB with status bits achieves smaller total space, $r$. Similar results are observed with other values of $\alpha$ and maximum flow size.

                $r$    $\beta_1$  $\beta_2$  $d_1$  $k_1$  $k_2$
status bit      4.13   0.71       0.065      4      3      3
no status bit   4.66   0.71       0.14       5      3      3

We summarize the above as the following rules of thumb.
1. Use a two-layer CB with status bits and 3 hash functions at each layer.
2. Empirically estimate (or guess based on historical data) the heavy-tail exponent $\alpha$ and the max flow size.
3. Compute $\beta^*$ according to Theorem 2. Set $m_1 = \beta^* n$ and $m_2 = 0.1 n$.
4. Use 5-bit counters at layer 1 for $\alpha \ge 1.1$, and 8-bit counters for $\alpha < 1.1$. Use deep enough counters at layer 2 so that the largest flow is accommodated (in general, 64-bit counters at layer 2 are deep enough).

7. EVALUATION

We evaluate the performance of Counter Braids using both randomly generated traces and real Internet traces. In Section 7.1 we generate a random graph and a random set of flow sizes for each run of experiment. We use $n = 1000$ and are able to average the reconstruction error, $P_{err}$, and the average error magnitude, $E_m$, over enough rounds so that their standard deviation is less than 1/10 of their magnitude.

In Section 7.2 we use 5-minute segments of two one-hour contiguous Internet traces and generate a random graph for each segment. We report results for the entire duration of two hours.

The reconstruction error $P_{err}$ is the total number of errors divided by the total number of flows, and the average error magnitude $E_m$ measures how big the deviation from the actual flow size is, provided an error has occurred.

7.1 Performance

First, we compare the performance of one-layer and two-layer CB. We use 1000 flows randomly generated from the distribution $P(f_i \ge x) = x^{-1.5}$, whose entropy is a little less than 3 bits. We vary the total number of bits per flow in CB and compute $P_{err}$ and $E_m$. In all experiments, we use CB with 3 hash functions. For the two-layer CB, we use 4-bit deep layer-1 counters with status bits. The results are shown in Figure 12.

The points labelled 1-layer and 2-layer threshold respectively are the asymptotic thresholds computed using density evolution. We observe that with 1000 flows, there is a sharp decrease in $P_{err}$ around this asymptotic threshold. Indeed, the error is less than 1 in 1000 when the number of bits per flow is 1 bit above the asymptotic threshold. With a larger number of flows, the decrease around threshold is expected to be even sharper.

Similarly, once above the threshold, the average error magnitude for both 1-layer and 2-layer Counter Braids is close to 1, the minimum magnitude of an error. When below the threshold, the average error magnitude increases only linearly as the number of bits decreases. At 1 bit per flow, we have 40-50% of flows incorrectly decoded, but the average error magnitude is only about 5. This means that many flow estimates are not far from the true values.

Together, we see that the 2-layer CB has much better performance than the 1-layer CB with the same space. As we increase the number of layers, the asymptotic threshold will move closer to entropy. However, we observe that the 2-layer CB has already accomplished most of the gain.

[Figure 12: Performance over a varying number of bits per flow. Two panels against bits per flow: reconstruction error $P_{err}$ (log scale) and average error magnitude $E_m$; curves for one layer and two layers, with the entropy, the 1-layer threshold and the 2-layer threshold marked.]
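To make the reconstruction step and the two metrics concrete, here is a toy single-layer example. The update rules in the code are an assumption: Exhibit 2 is not reproduced in this section, so the sketch restates the message passing rules in the form suggested by the proof of Theorem 1 (a counter-to-flow message subtracts the other flows' current estimates from the counter value, and flow-to-counter messages alternate between a minimum on odd iterations and a maximum on even ones). The graph, the flow sizes and all function names are hypothetical, and $E_m$ is taken to be the mean absolute deviation over the flows in error.

```python
def decode(edges, counters, min_size, iterations):
    """Single-layer message passing (assumed form of the update rules).

    edges maps each flow to a list of at least two distinct counters,
    counters maps each counter to its value, iterations must be >= 1.
    """
    flows_at = {}
    for i, cs in edges.items():
        for a in cs:
            flows_at.setdefault(a, []).append(i)
    # nu[(i, a)]: message from flow i to counter a, initialised to the minimum.
    nu = {(i, a): min_size for i, cs in edges.items() for a in cs}
    mu = {}
    for t in range(1, iterations + 1):
        # mu[(a, i)]: counter value minus the other flows' current estimates.
        mu = {
            (a, i): max(
                counters[a] - sum(nu[(j, a)] for j in flows_at[a] if j != i),
                min_size,
            )
            for (i, a) in nu
        }
        pick = min if t % 2 == 1 else max  # odd: upper bounds, even: lower bounds
        nu = {(i, a): pick(mu[(b, i)] for b in edges[i] if b != a) for (i, a) in nu}
    pick = min if iterations % 2 == 1 else max
    return {i: pick(mu[(a, i)] for a in cs) for i, cs in edges.items()}


def error_metrics(true_sizes, estimates):
    """Return (P_err, E_m): fraction of wrong flows and their mean |error|."""
    errors = [
        abs(estimates[i] - true_sizes[i])
        for i in true_sizes
        if estimates[i] != true_sizes[i]
    ]
    p_err = len(errors) / len(true_sizes)
    e_m = sum(errors) / len(errors) if errors else 0.0
    return p_err, e_m


# Hypothetical forest: a path A - 0 - B - 1 - C - 2 - D with counter leaves.
EDGES = {0: ["A", "B"], 1: ["B", "C"], 2: ["C", "D"]}
TRUE_SIZES = {0: 3, 1: 1, 2: 5}
COUNTERS = {}
for flow, cs in EDGES.items():
    for c in cs:
        COUNTERS[c] = COUNTERS.get(c, 0) + TRUE_SIZES[flow]

if __name__ == "__main__":
    for T in range(1, 5):
        est = decode(EDGES, COUNTERS, min_size=1, iterations=T)
        print(T, est, error_metrics(TRUE_SIZES, est))
```

On this small forest the estimates settle on the true sizes after two iterations, in line with the finite convergence stated in Theorem 1. A random graph with three hash functions per flow, as used in the experiments, would replace `EDGES`.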
[Figure 13: Performance over number of iterations. Axes: number of iterations against proportion of flows estimated incorrectly; curves for below threshold, at threshold, above threshold, and Count-Min. Note that $P_{err}$ for a Count-Min sketch with the same space as CB is high.]

Next, we investigate the number of iterations required to reconstruct the flows. Figure 13 shows the remaining proportion of incorrectly decoded flows as the number of iterations increases. The experiments are run for 1000 flows with the same distribution as above, on a one-layer Counter Braids. The number of bits per flow is chosen to be below, at and above the asymptotic threshold. As predicted by density evolution, $P_{err}$ decreases exponentially and converges to 0 at or above the asymptotic threshold, and converges to a positive error when below threshold. In this experiment, 10 iterations are sufficient to recover most of the flows.

7.2 Trace Simulation

We use two OC-48 (2.5 Gbps) one-hour contiguous traces at a San Jose router. Trace 1 was collected on Wednesday, Jan 15, 2003, 10am to 11am, hence representative of weekday traffic. Trace 2 was collected on Thur Apr 24, 2003, 12am to 1am, hence representative of night-time traffic. We divide each trace into 12 5-minute segments, corresponding to a measurement epoch. Figure 14 plots the tail distribution ($P(f_i \ge x)$) for all segments. Although the line rate is not high, the number of active flows is already significant. Each segment in trace 1 has approximately 0.9 million flows and 20 million packets, and each segment in trace 2 has approximately 0.7 million flows and 9 million packets. The statistics across different segments within one trace are similar.

[Figure 14: Tail distribution. Axes: flow size in packets against tail distribution $P(f_i \ge x)$, both on log scales; trace 1, 12 segments.]

Trace 1 has heavier traffic than trace 2 and also a heavier tail. In fact, it is the heaviest trace we have encountered so far, and is much heavier than, for instance, traces plotted in [16]. The proportion of one-packet flows in trace 1 is only 0.11, similar to that of a power-law distribution with $\alpha = 0.17$. Flows with size larger than 10 packets are distributed similar to a power law with $\alpha \approx 1$.

We fix the same sizing of CB for all segments, mimicking the realistic scenario where traffic varies over time and CB is built in hardware. We present the proportion of flows in error $P_{err}$ and the average error magnitude $E_m$ for both traces together. We vary the total number of bits in CB⁷, denoted by $B$, and present the result in Table 3.

For all experiments, we use a two-layer CB with status bits, and 3 hash functions at both layers. The layer-1 counters are 8 bits deep and the layer-2 counters are 56 bits deep.

$B$ (MB)    1.2   1.3   1.35  1.4
$P_{err}$   0.33  0.25  0.15  0
$E_m$       3     1.9   1.2   0

Table 3: Simulation results of counting 2 traces in 5-minute segments, on a fixed-size CB with total space $B$.

We observe a similar phenomenon as in Figure 12. As we underprovide space, the reconstruction error increases significantly. However, the error magnitude remains small. For these two traces, 1.4 MB is sufficient to count all flows correctly in 5-minute segments.
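The bits-per-flow figures of Section 6 and the fixed budget $B$ used here are related by simple arithmetic. The sketch below is one reading of the Section 6 sizing table, not a formula stated in the text. It assumes one status bit per layer-1 counter and none at layer 2, takes $d_2$ as the maximum flow size (13, interpreted as a number of bits) minus $d_1$, and treats 1 MB as $10^6$ bytes. All names are hypothetical.

```python
def bits_per_flow(beta1, beta2, d1, d2, status_bit):
    """Total space r in bits per flow for a two-layer sizing (assumed formula).

    beta1, beta2: counters per flow in layers 1 and 2
    d1, d2:       counter depths in bits
    status_bit:   whether each layer-1 counter carries one extra status bit
    """
    layer1_bits = d1 + (1 if status_bit else 0)
    return beta1 * layer1_bits + beta2 * d2


def budget_bits_per_flow(budget_mb, flows):
    """Bits per flow implied by a fixed budget, assuming 1 MB = 10**6 bytes."""
    return budget_mb * 1e6 * 8 / flows


if __name__ == "__main__":
    max_flow_bits = 13
    # Sizings from the Section 6 table, with d2 = 13 - d1.
    print(bits_per_flow(0.71, 0.065, 4, max_flow_bits - 4, status_bit=True))
    print(bits_per_flow(0.71, 0.14, 5, max_flow_bits - 5, status_bit=False))
    # Budget of 1.4 MB against the approximate flow counts per segment.
    print(budget_bits_per_flow(1.4, 0.9e6))  # trace 1
    print(budget_bits_per_flow(1.4, 0.7e6))  # trace 2
```

Under these assumptions the two sizings reproduce the tabulated rates of 4.13 and 4.66 to within rounding of the $\beta_i$. The budget arithmetic also illustrates why footnote 7 reports $B$ rather than bits per flow: the same memory corresponds to a different rate in each segment.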
⁷ We are not using bits per flow here since the number of flows is different in different segments.

8. IMPLEMENTATION

8.1 On-Chip Updates

Each layer of CB can be built on a separate block of SRAM to enable pipelining. On pre-built memories, the counter depth is chosen to be an integer fraction of the word length, so as to maximize space usage. This constraint does not exist with custom-made memories.

We need a list of flow labels to construct the first-layer graph for reconstruction. In cases where access frequencies for prefixes or filters are being collected, the flow nodes are simply the set of prefixes or filter criteria, which are the same across all measurement epochs. Hence no flow labels need to be collected or transferred.

In other cases where the flow labels are elements of a large space (e.g. flow 5-tuples), the labels need to be collected and transferred to the decoding unit. The method for collecting flow labels is application-specific, and may depend on the particular implementation of the application. We give the following suggestion for collecting flow 5-tuples in a specific scenario.

For TCP flows, a flow label can be written to a DRAM which maintains flow IDs when a flow is established; for example, when a "SYN" packet arrives. Since flows are established much less frequently than packet arrivals (approximately one in 40 packets causes a flow to be set up [10]), these memory accesses do not create a bottleneck. Flows that span boundaries of measurement epochs can be identified using a Bloom Filter [3].

Finally, we evaluated the algorithm by measuring flow sizes in packets. The algorithm can be used to measure flow sizes in bytes. Since most byte-counting is really the counting of byte-chunks (e.g. 32 or 64 byte-chunks), there is the question of choosing the "right granularity": a small value gives accurate counts but uses more space, and vice versa. We are working on a nice approach to this problem and will report results in future publications.

8.2 Computation Cost of Decoder

We reconstruct the flow sizes using the iterative message passing algorithm in an offline unit. The decoding complexity is linear in the number of flows. Decoding CB with more than one layer imposes only a small additional cost, since the higher layers are 1 to 2 orders smaller than the first layer. For example, decoding 1 million flows on a two-layer Counter Braids takes, on average, 15 seconds on a 2.6 GHz machine.

9. CONCLUSION AND FURTHER WORK

We presented Counter Braids, an efficient minimum-space counter architecture that solves large-scale network measurement problems such as per-flow and per-prefix counting. Counter Braids incrementally compresses the flow sizes as it counts, and the message passing reconstruction algorithm recovers flow sizes almost perfectly. We minimize counter space with incremental compression, and solve the flow-to-counter association problem using random graphs. As shown from real trace simulations, we are able to count up to 1 million flows purely in SRAM and recover the exact flow sizes. We are currently implementing this in an FPGA to determine the actual memory usage and to better understand implementation issues.

Several directions are open for further exploration. We mention two: (i) Since a flow passes through multiple routers, and since our algorithm is amenable to a distributed implementation, it will save counter space dramatically to combine the counts collected at different routers. (ii) Since our algorithm "degrades gracefully," in the sense that if the amount of space is less than the required amount, we can still recover many flows accurately and have errors of known size on a few, it is worth studying the graceful degradation formally as a "lossy compression" problem.

Acknowledgement: Support for OC-48 data collection is provided by DARPA, NSF, DHS, Cisco and CAIDA members. This work has been supported in part by NSF Grant Number 0653876, for which we are thankful. We also thank the Clean Slate Program at Stanford University, and the Stanford Graduate Fellowship program for supporting part of this work.

10. REFERENCES

[1] http://www.cisco.com/warp/public/732/Tech/netflow.
[2] Juniper networks solutions for network accounting. www.juniper.net/techcenter/appnote/350003.html.
[3] B. Bloom. Space/time trade-offs in hash coding with allowable errors. Comm. ACM, 13, July 1970.
[4] J. W. Byers, M. Luby, M. Mitzenmacher, and A. Rege. A digital fountain approach to reliable distribution of bulk data. In SIGCOMM, pages 56-67, 1998.
[5] G. Caire, S. Shamai, and S. Verdu. Noiseless data compression with low density parity check codes. In DIMACS, New York, 2004.
[6] E. Candes and T. Tao. Near optimal signal recovery from random projections and universal encoding strategies. IEEE Trans. Inform. Theory, 2004.
[7] G. Cormode and S. Muthukrishnan. An improved data stream summary: the count-min sketch and its applications. Journal of Algorithms, 55(1), April 2005.
[8] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley, New York, 1991.
[9] M. Crovella and A. Bestavros. Self-similarity in world wide web traffic: Evidence and possible causes. IEEE/ACM Trans. Networking, 1997.
[10] S. Dharmapurikar and V. Paxson. Robust TCP stream reassembly in the presence of adversaries. 14th USENIX Security Symposium, 2005.
[11] D. Donoho. Compressed sensing. IEEE Trans. Inform. Theory, 52(4), April 2006.
[12] C. Estan and G. Varghese. New directions in traffic measurement and accounting. Proc. ACM SIGCOMM Internet Measurement Workshop, pages 75-80, 2001.
[13] R. G. Gallager. Low-Density Parity-Check Codes. MIT Press, Cambridge, Massachusetts.
[14] M. Grossglauser and J. Rexford. Passive traffic measurement for IP operations. The Internet as a Large-Scale Complex System, 2002.
[15] F. Kschischang, B. Frey, and H.-A. Loeliger. Factor graphs and the sum-product algorithm. IEEE Trans. Inform. Theory, 47:498-519, 2001.
[16] A. Kumar, M. Sung, J. J. Xu, and J. Wang. Data streaming algorithms for efficient and accurate estimation of flow size distribution. Proceedings of ACM SIGMETRICS, 2004.
[17] Y. Lu, A. Montanari, and B. Prabhakar. Detailed network measurements using sparse graph counters: The theory. Allerton Conference, September 2007.
[18] M. Luby, M. Mitzenmacher, A. Shokrollahi, D. A. Spielman, and V. Stemann. Practical loss-resilient codes. In Proc. of STOC, pages 150-159, 1997.
[19] M. Mezard and A. Montanari. Constraint satisfaction networks in Physics and Computation. In Preparation.
[20] S. Ramabhadran and G. Varghese. Efficient implementation of a statistics counter architecture. Proc. ACM SIGMETRICS, pages 261-271, 2003.
[21] T. Richardson and R. Urbanke. Modern Coding Theory. Cambridge University Press, 2007.
[22] D. Shah, S. Iyer, B. Prabhakar, and N. McKeown. Analysis of a statistics counter architecture. Proc. IEEE HotI 9.
[23] Q. G. Zhao, J. J. Xu, and Z. Liu. Design of a novel statistics counter architecture with optimal space and time efficiency. SIGMetrics/Performance, June 2006.

Appendix: Asymptotic Optimality

We state the result on asymptotic optimality without a proof. The complete proof can be found in [17].

We make two assumptions on the flow size distribution $p$:
1. It has at most power-law tails. By this we mean that $P\{f_i \ge x\} \le A x^{-\epsilon}$ for some constant $A$ and some $\epsilon > 0$. This is a valid assumption for network statistics [9].
2. It has decreasing digit entropy. Write $f_i$ in its $q$-ary expansion $\sum_{a \ge 0} f_i(a) q^a$. Let $h_l = -\sum_x P(f_i(l) = x) \log_q P(f_i(l) = x)$ be the $q$-ary entropy of $f_i(l)$. Then $h_l$ is monotonically decreasing in $l$ for any $q$ large enough.

We call a distribution $p$ with these two properties admissible. This class includes most cases of practical interest. For instance, any power-law distribution is admissible. The (binary) entropy of this distribution is denoted by $H_2(p) \equiv -\sum_x p(x) \log_2 p(x)$.

For this section only, we assume that all counters in CB have an equal depth of $d$ bits. Let $q = 2^d$.

Definition 2. We represent CB as a sparse graph $G$, with vertices consisting of $n$ flows and a total of $m(n)$ counters in all layers. A sequence of Counter Braids $\{G_n\}$ has design rate $r$ if

$$r = \lim_{n \to \infty} \frac{m(n)}{n} \log_2 q. \qquad (1)$$

It is reliable for the distribution $p$ if there exists a sequence of reconstruction functions $\hat{F}_n \equiv \hat{F}_{G_n}$ such that

$$P_{err}(G_n, \hat{F}_n) \equiv P\{\hat{F}_n(c) \ne f\} \to 0 \quad \text{as } n \to \infty. \qquad (2)$$

Here is the main theorem:

Theorem 3. For any admissible input distribution $p$, and any rate $r > H_2(p)$, there exists a sequence of reliable sparse Counter Braids with asymptotic rate $r$.

The theorem is satisfying as it shows that the CB architecture is fundamentally good in the information-theoretic sense. Despite being incremental and linear, it is as good as, for example, Huffman codes, at infinite blocklength.
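The two quantities compared in Theorem 3 can be computed directly for a finite instance. The following sketch is only an illustration: the probability mass function is a hypothetical toy distribution (not one of the distributions used in the paper), the design rate is evaluated for a fixed ratio of counters to flows rather than as a limit, and the function names are assumptions.

```python
import math


def binary_entropy(pmf):
    """H_2(p) = -sum_x p(x) log2 p(x); pmf maps flow size -> probability."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)


def design_rate(counters_per_flow, depth_bits):
    """Finite-instance stand-in for equation (1): (m / n) * log2(q), q = 2**d."""
    q = 2 ** depth_bits
    return counters_per_flow * math.log2(q)


# Hypothetical toy distribution over flow sizes 1, 2 and 3.
TOY_PMF = {1: 0.5, 2: 0.25, 3: 0.25}

if __name__ == "__main__":
    h = binary_entropy(TOY_PMF)
    r = design_rate(counters_per_flow=0.5, depth_bits=4)
    # Theorem 3 concerns rates strictly above the entropy.
    print(h, r, r > h)
```

The comparison only checks the rate condition $r > H_2(p)$. Whether a particular finite graph decodes correctly is a separate question that the theorem answers only asymptotically.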