Counter Braids: A Novel Counter Architecture for Per-Flow Measurement

Yi Lu, Department of EE, Stanford University, yi.lu@stanford.edu
Andrea Montanari, Departments of EE and Stats, Stanford University, montanar@stanford.edu
Balaji Prabhakar, Departments of EE and CS, Stanford University, balaji@stanford.edu
Sarang Dharmapurikar, Nuova Systems, Inc, San Jose, California, sarang@nuovasystems.com
Abdul Kabbani, Department of EE, Stanford University, akabbani@stanford.edu

ABSTRACT
Fine-grained network measurement requires routers and switches to update large arrays of counters at very high link speed (e.g. 40 Gbps). A naive algorithm needs an infeasible amount of SRAM to store both the counters and a flow-to-counter association rule, so that arriving packets can update corresponding counters at link speed. This has made accurate per-flow measurement complex and expensive, and motivated approximate methods that detect and measure only the large flows.

This paper revisits the problem of accurate per-flow measurement. We present a counter architecture, called Counter Braids, inspired by sparse random graph codes. In a nutshell, Counter Braids "compresses while counting". It solves the central problems (counter space and flow-to-counter association) of per-flow measurement by "braiding" a hierarchy of counters with random graphs. Braiding results in drastic space reduction by sharing counters among flows; and using random graphs generated on-the-fly with hash functions avoids the storage of flow-to-counter association.

The Counter Braids architecture is optimal (albeit with a complex decoder) as it achieves the maximum compression rate asymptotically. For implementation, we present a low-complexity message passing decoding algorithm, which can recover flow sizes with essentially zero error. Evaluation on Internet traces demonstrates that almost all flow sizes are recovered exactly with only a few bits of counter space per flow.

Categories and Subject Descriptors
C.2.3 [Computer Communication Networks]: Network Operations - Network Monitoring; E.1 [Data Structures]

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGMETRICS'08, June 2–6, 2008, Annapolis, Maryland, USA. Copyright 2008 ACM 978-1-60558-005-0/08/06...$5.00.

General Terms
Measurement, Algorithms, Theory, Performance

Keywords
Statistics Counters, Network Measurement, Message Passing Algorithms

1. INTRODUCTION
There is an increasing need for fine-grained network measurement to aid the management of large networks [14]. Network measurement consists of counting the size of a logical entity called "flow", at an interface such as a router. A flow is a sequence of packets that satisfy a common set of rules. For instance, packets with the same source (destination) address constitute a flow. Measuring flows of this type gives the volume of upload (download) by a user and is useful for accounting and billing purposes. Measuring flows with a specific flow 5-tuple in the packet header gives more detailed information such as routing distribution and types of traffic in the network. Such information can help greatly with traffic engineering and bandwidth provisioning. Flows can also be defined by packet classification. For example, ICMP Echo packets used for network attacks form a flow. Measuring such flows is useful during and after an attack for anomaly detection and network forensics.

Currently there exists no large-scale statistics counter architecture that is both cheap and accurate. This is mainly due to the lack of affordable high-density high-bandwidth memory devices. To illustrate the problem, the processing time for a 64-byte packet at a 40-Gbps OC-768 link is 12 ns. This requires memories with access time much smaller than that of commercially available DRAM (whose access time is tens of nsec), and makes it necessary to employ SRAMs. However, due to their low density, large SRAMs are expensive and difficult to implement on-chip. It is, therefore, essential to find a counter architecture that minimizes memory space. There are two main components of the total space requirement:

1. Counter space. Assuming that a million distinct flows are observed in an interval¹ and using one 64-bit counter per flow (a standard vendor practice [20]), 8 MB of SRAM is needed for counter space alone.

2. Flow-to-counter association rule. The set of active flows varies over time, and the flow-to-counter association rule needs to be dynamically constructed. For a small number of flows, a content-addressable-memory (CAM) is used in most applications. However, the high power consumption and heat dissipation of CAMs forbid their use in realistic scenarios, and SRAM hash tables are used to store the flow-to-counter association rule. This requires at least another 10 MB of SRAM.

¹ Our OC-48 (2.5 Gbps) trace data show that there are about 900,000 distinct flow 5-tuples in a 5-minute interval. On 40-Gbps links, there can easily be an excess of a million distinct flow 5-tuples in a short observation interval. Or, for measuring the frequency of prefix accesses, one needs about 500,000 counters, which is the current size of IPv4 routing tables [20]. Future routers may easily support more than a million prefixes.

The large space requirement not only considerably increases the cost of line cards, but also hinders a compact layout of chips due to the low density of SRAM.

1.1 Previous Approaches
The wide applicability and inherent difficulty of designing statistics counters have attracted the attention of the research community. There are two main approaches: (i) exact counting using a hybrid SRAM-DRAM architecture, and (ii) approximate counting by exploiting the heavy-tail nature of flow size distribution. We review these approaches below.

Exact counting. Shah et al. [22] proposed and analyzed a hybrid architecture, taking the first step towards an implementable large-scale counter array. The architecture consists of shallow counters in fast SRAM and deep counters in slow DRAM. The challenge is to find a simple algorithm for updating the DRAM counters so that no SRAM counter overflows in between two DRAM updates. The algorithm analyzed in [22] was subsequently improved by Ramabhadran and Varghese [20] and Zhao et al. [23]. This reduced the algorithm complexity, making it feasible to use a small SRAM with 5 bits per flow to count flow sizes in packets (not bytes). However, all the papers above suffer from the following drawbacks: (i) deep (typically 64 bits per flow) off-chip DRAM counters are needed, (ii) costly SRAM-to-DRAM updates are required, and (iii) the flow-to-counter association problem is assumed to be solved using a CAM or a hash table. In particular, they do not address the flow-to-counter association problem.

Approximate counting. To keep cost acceptable, practical solutions from the industry and academic research either sacrifice the accuracy or limit the scope of measurement. For example, Cisco's NetFlow [1] counts both 5-tuples and per-prefix flows based on sampling, which introduces a significant 9% relative error even for large flows and more errors for smaller flows [12]. Juniper Networks introduced filter-based accounting [2] to count a limited set of flows predefined manually by operators. The "sample-and-hold" solution proposed by Estan and Varghese in [12], while achieving high accuracy, measures only flows that occupy more than 0.1% of the total bandwidth. Estan and Varghese's approach introduced the idea of exploiting the heavy-tail flow size distribution: since a few large flows bring most of the data, it is feasible to quickly identify these large flows and measure their sizes only.

1.2 Our Approach
The main contribution of this paper is an SRAM-only large-scale counter architecture with the following features:

1. Flow-to-counter association using a small number (e.g. 3) of hash functions.
2. Incremental compression of flow sizes as packets arrive; only a small number (e.g. 3) of counters are accessed at each packet arrival.
3. Asymptotic optimality. We have proved in [17] that Counter Braids (CB), with an optimal (but NP-hard) decoder, has an asymptotic compression rate matching the information theoretic limit. The result is surprising since CB forms a restrictive family of compressors.
4. A linear-complexity message passing decoding algorithm that recovers all flow sizes from compressed counts with essentially zero error. Total space in CB needed for exact recovery is close to the optimal compression of flow sizes.
5. The message passing algorithm is analyzable, enabling the choice of design parameters for different hardware requirements.

Remark: We note that CB has the disadvantage of not supporting instantaneous queries of flow sizes. All flow sizes are decoded together at the end of a measurement epoch. We plan to address this problem in future work.

Informal description. Counter Braids is a hierarchy of counters braided via random graphs in tandem. Figure 1(a) shows a naive counter architecture that stores five flow sizes in counters of equal depth, which has to exceed the size of the largest flow. Each bit in a counter is shown as a circle. The least significant bit (LSB) is the one closest to the flow node. Filled circles represent a 1, and unfilled circles a 0. This structure leads to an enormous wastage of space because the majority of flows are small. Figure 1(b) shows CB for storing the same flow sizes. It is worth noting that: (i) CB has fewer "more significant bits" and they are shared among all flows, and (ii) the exact flow sizes can be obtained by "decoding" the bit pattern stored in CB. A comparison of the two
figures clearly shows a great reduction in space.

1.3 Related Theoretical Literature
Compressed Sensing. The idea of Counter Braids is thematically related to compressed sensing [6, 11], whose central innovation is summarized by the following quote:

Since we can "throw away" most of our data and still be able to reconstruct the original with no perceptual loss (as we do with ubiquitous sound, image and data compression formats,) why can't we directly measure the part that will not end up being "thrown away"? [11]

For the network measurement problem, we obtain a vector of counter values, $c$, via CB, from the flow sizes $f$. If $f$ has a small entropy, the vector $c$ occupies much less space than $f$; it constitutes "the part (of $f$) that will not end up being thrown away." An off-chip decoding algorithm then recovers $f$ from $c$. While Compressed Sensing and CB are thematically related, they are methodologically quite different: Compressed Sensing computes random linear transformations of the data and uses LP (linear programming) reconstruction methods; whereas CB uses a multi-layered non-linear structure and a message passing reconstruction algorithm.

Figure 1: (a) A simple counter structure. (b) Counter Braids. (filled circle = 1, unfilled circle = 0).

Sparse random graph codes. Counter Braids is methodologically inspired by the theory of low-density parity check (LDPC) codes [13, 21]. See also related literature on Tornado codes [18] and Fountain codes [4]. From the information theoretic perspective, the design of an efficient counting scheme and a good flow size estimation is equivalent to the design of an efficient compressor, or a source code [8]. However, the network measurement problem imposes a stringent constraint on such a code: each time the size of a flow changes (because a new packet arrives), a small number of operations must be sufficient to update the compressed information. This is not the case with standard source codes (such as the Lempel-Ziv algorithm), where changing a single bit in the source stream may completely alter the compressed version. We
find that the class of source codes dual to LDPC codes [5] work well under this constraint; using features of these codes makes CB a good "incremental compressor."

There is a problem in using the design of LDPC codes for network measurement: with the heavy-tailed distribution, the flow sizes are a priori unbounded. In the channel coding language, this is equivalent to using a countable but infinite input alphabet. As a result, new ideas are developed for proving the achievability of optimal asymptotic compression rate. The full proof is contained in [17] and we state the theorem in the appendix for completeness.

The large alphabet size also makes iterative message passing decoding algorithms [15], such as Belief Propagation, highly complex to implement, as BP passes probabilities rather than numbers. In this paper, we present a novel message passing decoding algorithm of low complexity that is easy to implement. The sub-optimality of the message passing algorithm naturally requires more counter space than the information theoretic limit. We characterize the minimum space required for zero asymptotic decoding error using "density evolution" [21]. The space requirement can be further optimized with respect to the number of layers in Counter Braids, and the degree distribution of each layer. The optimized space is close to the information theoretic limit, enabling CB to fit into small SRAM.

Count-Min Sketch. Like Counter Braids, the Count-Min sketch [7] for data stream applications is also a random hash-based structure. With Count-Min, each flow hashes to and updates $d$ counters; the minimum value of the $d$ counters is retrieved as the flow estimate. The Count-Min sketch provides probabilistic guarantees for the estimation error: with at least $1 - \delta$ probability, the estimation error is less than $\epsilon |f|_1$, where $|f|_1$ is the sum of all flow sizes. To have small $\epsilon$ and $\delta$, the number of counters needs to be large.

The Count-Min sketch is different from Counter Braids in the following ways: (a) There is no "braiding" of counters, hence no compression. (b) The estimation algorithm for the Count-Min sketch is one-step, whereas it is iterative for CB. In fact, comparing the Count-Min algorithm to our reconstruction algorithm on a one-layer CB, it is easy to see that the estimate by Count-Min is exactly the estimate after the
first iteration of our algorithm. Thus, CB performs at least as well as the Count-Min algorithm.² (c) Our reconstruction algorithm detects errors. That is, it can distinguish the flows whose sizes are incorrectly estimated, and produce an upper and lower bound of the true value; whereas the Count-Min sketch only guarantees an over-estimate. (d) CB needs to decode all the flow sizes at once, unlike the Count-Min algorithm which can estimate a single flow size. Thus, Count-Min is better at handling online queries than CB.

² This is similar to the benefit of Turbo codes over conventional soft-decision decoding algorithms and illustrates the power of the "Turbo principle."

Structurally related to Counter Braids (random hashing of flows into counters and a recovery algorithm) is the work of Kumar et al. [16]. The goal of that work is to estimate the flow size distribution and not the actual flow sizes, which is our aim.

In Section 2, we define the goals of this paper and outline our solution methodology. Section 3 describes the Counter Braids architecture. The message passing decoding algorithm is described in Section 4 and analyzed in Section 5. Section 6 explores the choice of parameters for multi-layer CB. The algorithm is evaluated using traces in Section 7. We discuss implementation issues in Section 8 and outline further work in Section 9.

2. PROBLEM FORMULATION
We divide time into measurement epochs (e.g. 5 minutes). The objective is to count the number of packets per flow for all active flows within a measurement epoch. We do not deal with the byte-counting problem in this paper due to space limitation, but there is no constraint in using Counter Braids for byte-counting.

Goals: As mentioned in Section 1, the main problems we wish to address are: (i) compacting (or eliminating) the space used by the flow-to-counter association rule, and (ii) saving counter space and incrementally compressing the counts.
Additionally, we would like (iii) a low-complexity algorithm to reconstruct flow sizes at the end of a measurement epoch.

Solution methodology: Corresponding to the goals, we (i) use a small number of hash functions, (ii) braid the counters, and (iii) use a linear-complexity message-passing algorithm to reconstruct flow sizes. In particular, by using a small number of hash functions, we eliminate the need for storing a flow-to-counter association rule.

Performance measures:

(1) Space: measured in number of bits per flow occupied by counters. We denote it by $r$ (to suggest compression rate as in the information theory literature.) Note that the number of counters is not the correct measure of compression rate; rather, it is the number of bits.

(2) Reconstruction error: measured as the fraction of flows whose reconstructed value is different from the true value:

$$P_{err} \equiv \frac{1}{n} \sum_{i=1}^{n} I_{\{\hat{f}_i \neq f_i\}},$$

where $n$ is the total number of flows, $\hat{f}_i$ is the estimated size of flow $i$ and $f_i$ the true size. $I$ is the indicator function, which returns 1 if the expression in the bracket is true and 0 otherwise. We chose this metric since we want exact reconstruction.

(3) Average error magnitude: defined as the ratio of the sum of absolute errors and the number of errors:

$$E_m = \frac{\sum_i |f_i - \hat{f}_i|}{\sum_i I(f_i \neq \hat{f}_i)}.$$

It measures how big an error is when an error has occurred.

The statement of asymptotic optimality in the appendix yields that it is possible to keep space equal to the flow-size entropy, and have reconstruction error going to 0 as the number of flows goes to infinity.

Both analysis (Section 5) and simulations (Section 7) show that with our low-complexity message passing decoding algorithm, we can keep space close to the flow-size entropy and obtain essentially zero reconstruction error. In addition, the algorithm offers a graceful degradation of error when space is further reduced, even below the flow-size entropy. Although reconstruction error becomes significant, average error magnitude remains small, which means that most flow-size estimates are close to their true values.

3. OUR SOLUTION
The overall architecture of our solution is shown in Figure 2. Each arriving packet updates Counter Braids in on-chip SRAM. This constitutes the encoding stage if we view measurement as compression. At the end of a measurement epoch, the content of Counter Braids, i.e., the compressed counts, is transferred to an offline processing unit, such as a PC. A reconstruction algorithm then recovers the list of ⟨flow ID, size⟩ pairs.

We describe CB in Section 3.1 and specify the mapping that solves the flow-to-counter association problem in Section 3.2. We describe the updating scheme, or the on-chip encoding algorithm, in Section 3.3, leaving the description of the reconstruction algorithm to Section 4.

Figure 2: System Diagram.

3.1 Counter Braids
Counter Braids has a layered structure. The $l$-th layer has $m_l$ counters with a depth of $d_l$ bits. Let the total number of layers be $L$. In practice, $L = 2$ is usually sufficient as will be shown in Section 6. Figure 3 illustrates the case where $L = 2$. For a complete description of the structure, we leave $L$ as a parameter.
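The layered structure, together with the space measure $r$ from Section 2, can be pictured with a small sketch. This is only an illustration: the `Layer` class, its field names, and the sizing numbers in the usage note are hypothetical and are not taken from the paper. The `status_bit` flag anticipates the optional status bits shown in Figure 3, each of which is assumed to cost one extra bit per counter.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """One layer of a Counter Braids structure (illustrative names)."""
    num_counters: int         # m_l, number of counters in layer l
    depth_bits: int           # d_l, counter depth in bits
    status_bit: bool = False  # one extra bit per counter, if used


def bits_per_flow(layers, num_flows):
    """Space r: total counter bits (plus status bits) divided by n."""
    total_bits = sum(
        layer.num_counters * (layer.depth_bits + int(layer.status_bit))
        for layer in layers
    )
    return total_bits / num_flows
```

With made-up sizes, `bits_per_flow([Layer(1000, 4, True), Layer(100, 32)], 1000)` gives (1000 × 5 + 100 × 32) / 1000 = 8.2 bits per flow.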
Figure 3: Two-layer Counter Braids with two hash functions and status bits.

We will show in later sections that we can use a decreasing number of counters in each layer of CB, and still be able to recover the flow sizes correctly. The idea is that given a heavy-tail distribution for flow sizes, the more significant bits in the counters are poorly utilized; since braiding allows more significant bits to be shared among all flows, a reduced number of counters in the higher layers suffices.

Figure 3 also shows an optional feature of CB, the status bits. A status bit is an additional bit on a first-layer counter. It is set to 1 after the corresponding counter first overflows. Counter Braids without status bits is theoretically sufficient: the asymptotic optimality result in the appendix is shown without status bits, assuming a high-complexity optimal decoder. However, in practice we use a low-complexity message passing decoder, and the particular shape of the network traffic distribution is better exploited with status bits. Status bits occupy additional space, but provide useful information to the message-passing decoder so that the number of second-layer counters can be further reduced, yielding a favorable tradeoff in space. Status bits are taken into account when computing the total space; in particular, they figure in the performance measure, $r$, "space in number of bits per flow." In CB with more than two layers, every layer except the last will have counters with status bits.

3.2 The Random (Hash) Mappings
We use the same random mapping in two settings: (i) between flows and the first-layer counters, and (ii) between two consecutive layers of counters. The dashed arrows in Figure 3 illustrate both (i) and (ii) (which is between the
first and second layer counters.)

Consider the random mapping between flows and the layer-1 counters. For each flow ID, we apply $k$ pseudo-random hash functions with a common range $\{0, \ldots, m_1 - 1\}$, where $m_1$ is the number of counters in layer 1, as illustrated in Figure 3 (with $k = 2$.) The mapping has the following features:

1. It is dynamically constructed for a varying set of active flows, by applying hash functions to flow IDs. In other words, no memory space is needed to describe the mapping explicitly. The storage for the flow-to-counter association is simply the size of description of the $k$ hash functions and does not increase with the number of flows $n$.
2. The number of hash functions $k$ is set to a small constant (e.g. 3). This allows counters to be updated with only a small number of operations at a packet arrival.

Remark. Note that the mapping does not have any special structure. In particular, it is not bijective. This necessitates the use of a reconstruction algorithm to recover the flow sizes. Using $k > 1$ adds redundancy to the mapping and makes recovery possible. However, the random mapping does more than simplifying the flow-to-counter association. In fact, it performs the compression of flow sizes into counter values and reduces counter space.

Next consider the random mapping between two consecutive layers of counters. For each counter location (in the range $\{0, \ldots, m_l - 1\}$) in the $l$-th layer, we apply $k$ hash functions to obtain the corresponding $(l+1)$-th layer counter locations (in the range $\{0, \ldots, m_{l+1} - 1\}$). It is illustrated in Figure 3 with $k = 2$. The use of hash functions enables us to implement the mapping without extra circuits in the hardware; and the random mapping further compresses the counts in layer-2 counters.

3.3 Encoding: The Updating Algorithm
The initialization and update procedures of a two-layer Counter Braids with 2 hash functions at each layer are specified in Exhibit 1. The procedures include both the generation of random mapping using hash functions and the updating scheme. When a packet arrives, both counters its flow label hashes into are incremented. And when a counter in layer 1 overflows, both counters in layer 2 it hashes into are incremented by 1, like a carry-over. The overflowing counter is reset to 0 and the corresponding status bit is set to 1.

It is evident from the exhibit that the amount of updating required is very small. Yet after each update, the counters store a compressed version of the most up-to-date flow sizes. The incremental nature of this compression algorithm is made possible with the use of random sparse linear codes, which we shall further exploit at the reconstruction stage.

Exhibit 1: The Update Algorithm

    Initialize
        for layer l = 1 to 2
            for counter i = 1 to m_l
                counters[l][i] = 0

    Update
        Upon the arrival of a packet pkt
            idx1 = hash-function1(pkt);
            idx2 = hash-function2(pkt);
            counters[1][idx1] = counters[1][idx1] + 1;
            counters[1][idx2] = counters[1][idx2] + 1;
            if counters[1][idx1] overflows,
                Update second-layer counters (idx1);
            if counters[1][idx2] overflows,
                Update second-layer counters (idx2)

    Update second-layer counters (idx)
        statusbit[1][idx] = 1;
        idx3 = hash-function3(idx);
        idx4 = hash-function4(idx);
        counters[2][idx3] = counters[2][idx3] + 1;
        counters[2][idx4] = counters[2][idx4] + 1

The update of the second-layer counters can be pipelined. It can be executed together with the next update of the first-layer counters. In general, pipelining can be used for CB with multiple layers.

Figure 4: A toy example for updating. Numbers next to flow nodes are current flow sizes. Dotted lines indicate hash functions. Thick lines indicate hash functions being computed by an arriving packet. The flow with an arriving packet is indicated by an arrow.

Figure 4 illustrates the updating algorithm with a toy example. (a) shows the initial state of CB with two flows. In (b), a new flow arrives, bringing the first packet; a layer-1 counter overflows and updates two layer-2 counters. In (c), a packet of an existing flow arrives and no overflow occurs. In (d), another packet of an existing flow arrives and another layer-1 counter overflows.
4. MESSAGE PASSING DECODER
The sparsity of the random graphs³ in CB opens the way to using low-complexity message passing algorithms for reconstruction of flow sizes, but the design of such an algorithm is not obvious. In the case of LDPC codes, message passing decoding algorithms hold the promise of approaching capacity with unprecedentedly low complexity. However, the algorithms used in coding, such as Belief Propagation, have increasing memory requirement as the alphabet size grows, since BP passes probability distributions instead of single numbers. We develop a novel message passing algorithm that is simple to implement on countable alphabets.

³ Each random mapping in CB is a random bipartite graph with edges generated by the $k$ hash functions. It is sparse because the number of edges is linear in the number of nodes, as opposed to quadratic for a complete bipartite graph.

4.1 One Layer
Consider the random mapping between flows and the first-layer counters. It is a bipartite graph with flow nodes on the left and counter nodes on the right, as shown in Figure 5. An edge connects flow $i$ and counter $a$ if one of the $k$ hash functions maps flow $i$ to counter $a$. The vector $f$ denotes flow sizes and $c$ denotes counter values.

$$c_a = \sum_{i \in \partial a} f_i,$$

where $\partial a$ denotes all the flows that hash into counter $a$. The problem is to estimate $f$ given $c$.

Figure 5: Message passing on a bipartite graph with flow nodes (circles) and counter nodes (rectangles.)

Message passing algorithms are iterative. In the $t$-th iteration messages are passed from all flow nodes to all counter nodes and then back in the reverse direction. A message goes from flow $i$ to counter $a$ (denoted by $\nu_{ia}$) and vice versa (denoted by $\mu_{ai}$) only if nodes $i$ and $a$ are neighbors (connected by an edge) on the bipartite graph.

Our algorithm is described in Exhibit 2. The messages $\nu_{ia}(0)$ are initialized to 0, although any initial value less than the minimum flow size, min, will work just as well. The interpretation of the messages is as follows: $\mu_{ai}$ conveys counter $a$'s guess of flow $i$'s size based on the information it received from neighboring flows other than flow $i$. Conversely, $\nu_{ia}$ is the guess by flow $i$ of its own size, based on the information it received from neighboring counters other than counter $a$.

Exhibit 2: The Message Passing Decoding Algorithm

    1: Initialize
    2:     min = minimum flow size;
    3:     $\nu_{ia}(0) = 0$ for all $i$ and all $a$;
    4:     $c_a$ = $a$-th counter value
    5: Iterations
    6:     for iteration number t = 1 to T
    7:         $\mu_{ai}(t) = \max\{ c_a - \sum_{j \neq i} \nu_{ja}(t-1),\ \mathrm{min} \}$;
    8:         $\nu_{ia}(t) = \min_{b \neq a} \mu_{bi}(t)$ if t is odd, $\max_{b \neq a} \mu_{bi}(t)$ if t is even.
    9: Final Estimate
    10:    $\hat{f}_i(T) = \min_a \{\mu_{ai}(T)\}$ if T is odd, $\max_a \{\mu_{ai}(T)\}$ if T is even.

Figure 6: The decoding algorithm over 4 iterations. Numbers in the topmost figure are true flow sizes and counter values. In an iteration, numbers next to a node are messages on its outgoing edges, from top to bottom. Each iteration involves messages going from flows to counters and back from counters to flows.

Remark 1. Since $\nu_{ia}(0) = 0$, $\mu_{ai}(1) = c_a$ and $\hat{f}_i(1) = \min_a \{c_a\}$, which is precisely the estimate of the Count-Min algorithm. Thus, the estimate of Count-Min is the estimate of our message-passing algorithm after the first iteration.

Remark 2. The distinction between odd and even iterations at lines 8 and 10 is due to the "anti-monotonicity property" of the message-passing algorithm, to be discussed in Section 5.

Remark 3. It turns out that the algorithm remains unchanged if the minimum or maximum at line 8 is over all incoming messages, that is, $\nu_{ia}(t) = \min_b \mu_{bi}(t)$ if $t$ is odd, $\max_b \mu_{bi}(t)$ if $t$ is even. The change will save some computations in implementation. The proof of this fact and ensuing analytical consequences is deferred to forthcoming publications. In this paper, we stick to the algorithm in Exhibit 2.
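A minimal sketch of the single-layer decoder of Exhibit 2 is given here for concreteness. The function name `decode_one_layer`, the adjacency-list input format, and the dictionaries used to hold messages are illustrative choices, not part of the paper; the sketch assumes every flow is connected to at least two distinct counters and that at least one iteration is run.

```python
def decode_one_layer(counters, flow_to_counters, f_min=1, num_iters=10):
    """Message-passing decoder sketch for one layer (after Exhibit 2).

    counters[a] is c_a; flow_to_counters[i] lists the distinct counters
    that flow i hashes to (at least two per flow); num_iters >= 1.
    """
    counter_to_flows = [[] for _ in counters]
    for i, ctrs in enumerate(flow_to_counters):
        for a in ctrs:
            counter_to_flows[a].append(i)

    # nu[(i, a)]: message from flow i to counter a, initialized to 0.
    nu = {(i, a): 0 for i, ctrs in enumerate(flow_to_counters) for a in ctrs}
    mu = {}  # mu[(a, i)]: message from counter a to flow i
    for t in range(1, num_iters + 1):
        # Counter -> flow: mu_ai = max(c_a - sum_{j != i} nu_ja, min).
        for a, flows in enumerate(counter_to_flows):
            total = sum(nu[(j, a)] for j in flows)
            for i in flows:
                mu[(a, i)] = max(counters[a] - (total - nu[(i, a)]), f_min)
        # Flow -> counter: min over the other counters if t is odd, else max.
        pick = min if t % 2 == 1 else max
        for i, ctrs in enumerate(flow_to_counters):
            for a in ctrs:
                nu[(i, a)] = pick(mu[(b, i)] for b in ctrs if b != a)
    # The final estimate follows the parity of the last iteration T.
    pick = min if num_iters % 2 == 1 else max
    return [pick(mu[(a, i)] for a in ctrs)
            for i, ctrs in enumerate(flow_to_counters)]
```

On a small path-shaped graph, `flow_to_counters = [[0, 1], [1, 2], [2, 3]]` with true sizes 3, 1, 5 (so `counters = [3, 4, 6, 5]`), one iteration returns the Count-Min estimate `[3, 4, 5]`, and ten iterations return the exact sizes `[3, 1, 5]`.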
Toy example. Figure 6 shows the evolution of messages over 4 iterations on a toy example. In this particular example, all flow sizes are reconstructed correctly. Note that we are using different degrees at some flow nodes. In general, this gives potentially better performance than all flow nodes having the same degree, but we will stick to the latter in this paper for its ease of implementation. The flow estimates at each iteration are listed in Table 1. All messages converge in 4 iterations and the estimates at Iteration 1 (second column) are the Count-Min estimates.

    iteration      0    1    2    3    4
    $\hat{f}_1$    0   34    1    1    1
    $\hat{f}_2$    0   34    1    1    1
    $\hat{f}_3$    0   32   32   32   32

Table 1: Flow estimates at each iteration. All messages converge after Iteration 3.

4.2 Multi-layer
Multi-layer Counter Braids are decoded recursively, one layer at a time. It is conceptually helpful to construct a new set of flows $f_l$ for layer-$l$ counters based on the counter values at layer $(l-1)$. The presence of status bits affects the definition of $f_l$.

Figure 7: Without status bits, flows in $f_2$ have a one-to-one map to all counters in $c_1$.

Figure 8: With status bits, flows in $f_2$ have a one-to-one map to only counters that have overflowed (whose status bits are set to 1).

Figure 7 illustrates the construction of $f_2$ when there are no status bits. The vector $f_2$ has a one-to-one map to counters in layer 1, and a flow size in $f_2$ equals the number of times the corresponding counter has overflowed, with the minimum value 0.

Figure 8 illustrates the construction of $f_2$ when there are status bits. The vector $f_2$ now has a one-to-one correspondence with only those counters in layer 1 that have overflowed; that is, counters whose status bits are set to 1. The new flow size is still the number of times the corresponding counter overflows, but in this case, the minimum value is 1.

It is clear from the figure that the use of status bits effectively reduces the number of flow nodes in layer 2. Hence, fewer counters are needed in layer 2 for good decodability. This reduction in counter space at layer 2 trades off with the additional space needed for status bits themselves! As we shall see in Section 6, when the number of layers in CB is small, the tradeoff
favors the use of status bits.

The flow sizes are decoded recursively, starting from the topmost layer. For example, after decoding the layer-2 "flows," we add their sizes (the amount of overflow from layer-1 counters) to the values of layer-1 counters. We then use the new values of layer-1 counters to decode the flow sizes. Details of the algorithm are presented in Exhibit 3.

Exhibit 3: The Multi-layer Algorithm

    for l = L to 1
        construct the graph for the l-th layer as in Figure 7 if without status bits; as in Figure 8 if with status bits;
        decode $f_l$ from $c_l$ as in Exhibit 2
        $c_{l-1} = c_{l-1} + f_l \times 2^{d_{l-1}}$, where $d_{l-1}$ is the counter depth in bits at layer $(l-1)$

5. SINGLE-LAYER ANALYSIS
The decoding algorithm works one layer at a time; hence, we first analyze the single-layer message passing algorithm and determine its rate $r$ and reconstruction error probability $P_{err}$. This analysis lays the foundation for the design of multi-layer Counter Braids, to be presented in Section 6.

Since all counters in layer 1 have the same depth $d_1$, a very relevant quantity for the analysis is the number of counters per flow:

$$\beta \equiv m/n,$$

where $m$ is the number of counters and $n$ is the number of flows. The compression rate in bits per flow is given by $r = \beta d_1$.

The bipartite graph in Figure 5 will be the focus of study, as its properties determine the performance of the algorithm.

Lemma 1. Toggling Property. If $\nu_{ia}(t-1) \le f_i$ for every $i$ and $a$, then $\mu_{ai}(t) \ge f_i$ and $\nu_{ia}(t) \ge f_i$. Conversely, if $\nu_{ia}(t-1) \ge f_i$ for every $i$ and $a$, then $\mu_{ai}(t) \le f_i$ and $\nu_{ia}(t) \le f_i$.

The proof of this lemma follows simply from the definition of $\nu$ and $\mu$ and is omitted.

Lemma 2. Anti-monotonicity Property. If $\nu$ and $\nu'$ are such that for every $i$ and $a$, $\nu_{ia}(t-1) \le \nu'_{ia}(t-1) \le f_i$, then $\nu_{ia}(t) \ge \nu'_{ia}(t) \ge f_i$. Consequently, since $\hat{f}(0) = 0$, $\hat{f}(2t) \le f$ component-wise and $\hat{f}(2t)$ is component-wise non-decreasing. Similarly $\hat{f}(2t+1) \ge f$ and is component-wise non-increasing.

Proof. It follows from line 7 of Exhibit 2 that, if $\nu_{ia}(t-1) \le \nu'_{ia}(t-1) \le f_i$, then $\mu_{ai}(t) \ge \mu'_{ai}(t) \ge f_i$.⁴ From this and the definitions of $\nu$ and $\hat{f}$ at lines 8 and 10 of Exhibit 2, the rest of the lemma follows.
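As a companion to Exhibit 3 in Section 4.2, the carry-back step $c_{l-1} = c_{l-1} + f_l \times 2^{d_{l-1}}$ can be sketched for the two-layer case with status bits. The function name `restore_layer1` and the convention that the decoded layer-2 "flows" are listed in increasing order of layer-1 counter index are assumptions made for this illustration only.

```python
def restore_layer1(c1, status, f2, d1):
    """Add decoded overflow counts back onto the layer-1 counters.

    c1[a]     : residual value of layer-1 counter a (its low d1 bits)
    status[a] : 1 if counter a has overflowed at least once, else 0
    f2        : decoded layer-2 "flow" sizes, one per overflowed counter,
                in increasing order of counter index (assumed ordering)
    d1        : depth of layer-1 counters in bits
    """
    overflow = iter(f2)
    return [c + ((next(overflow) << d1) if s else 0)
            for c, s in zip(c1, status)]
```

For instance, with `d1 = 2`, residual values `[1, 3, 0]`, status bits `[0, 1, 1]` and decoded overflow counts `[2, 1]`, the restored layer-1 values are `[1, 11, 4]`; these are the counter values that the single-layer decoder would then be run on.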
The above lemmas give a powerful conclusion: The true value of the flow-size vector is sandwiched between monotonically increasing lower bounds and monotonically decreasing upper bounds. The question, therefore, is:

Convergence: When does the sandwich close? That is, under what conditions does the message passing algorithm converge?

We give two answers. The first is general, not requiring any knowledge of the flow-size distribution. The second uses the flow-size distribution, but gives a much better answer. Indeed, one obtains an exact threshold for the convergence of the algorithm: For $\beta > \beta^*$ the algorithm converges, and for $\beta < \beta^*$ it fails to converge (i.e. the sandwich does not close.)

5.1 Message Passing on Trees
Definition 1. A graph is a forest if for all nodes in the graph, there exists no path of non-vanishing length that starts and ends at the same node. In other words, the graph contains no loops. Such a graph is a tree if it is connected.

Fact 1. Consider a bipartite graph with $n$ flow nodes and $m = \beta n$ counter nodes, where each flow node connects to $k$ uniformly sampled counter nodes. It is a forest with high probability iff $\beta \ge k(k-1)$ [19].

Assume the bipartite graph is a forest. Since the flow nodes have degree $k > 1$, the leaves of the trees have to be counter nodes.

Theorem 1. For any flow node $i$ belonging to a tree component in the bipartite graph, the message passing algorithm converges to the correct flow estimates after a finite number of iterations. In other words, for every $a$, $\mu_{ai}(t)$, $\nu_{ia}(t)$ and $\hat{f}_i(t)$ all coincide with $f_i$ for all $t$ large enough.

Figure 9: The tree $T_{a \to i}$ rooted at the directed edge $a \to i$. Its depth is $D_{a \to i} = 2$.

Proof. For simplicity we prove convergence for $\mu_{ai}(t)$, as the convergence of other quantities easily follows. Given the directed edge $a \to i$, consider the subtree $T_{a \to i}$ rooted at $a \to i$ obtained by cutting all the counter nodes adjacent to $i$ but $a$, cf. Figure 9. Clearly $\mu_{ai}(t)$ only depends on the counter values inside $T_{a \to i}$, and we restrict our attention to this subtree. Let $D_{a \to i}$ denote the depth of $T_{a \to i}$. We shall prove by induction on $D_{a \to i}$ that $\mu_{ai}(t) = f_i$ for any $t \ge D_{a \to i}$.
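If $D_{a \to i} = 1$, this is trivially true: at any time $\mu_{ai}(t) = c_a$ and since $c_a = f_i$, the thesis follows. Assume now that the thesis holds for all depths up to $D$, and consider $D_{a \to i} = D + 1$. Let $j$ be one of the flows in $T_{a \to i}$ that hashes to counter $a$, and let $b$ denote one of the other counters to which it contributes, cf. Figure 9. Since the depth of the subtree $T_{b \to j}$ is at most $D$, by the induction hypothesis, $\mu_{bj}(t) = f_j$ for any $t \ge D$. Consider now $t \ge D + 1$. From the messages defined in Exhibit 2 and the previous observation, it follows that $\mu_{ai}(t) = c_a - \sum_{j \neq i} f_j = f_i$ as claimed.

⁴ Note that we implicitly assume that $t$ is odd to be consistent with the definition of $\nu(\cdot)$ at line 8.

Unfortunately, the use of the above theorem for CB requires $\beta \ge k(k-1)$, which leads to an enormous wastage of counters. We will now assume knowledge of the flow-size distribution and dramatically reduce $\beta$. We will work with sparse random graphs that are not forests, but rather they will have a locally tree-like structure.

5.2 Sparse Random Graph
It turns out that we are able to characterize the reconstruction error probability at the $t$-th iteration of the algorithm more precisely. A nice observation enables us to use the idea of density evolution, developed in coding theory [21], to compute the error probability recursively in the large $n$ limit. Due to space limitation, we are unable to fully describe the ideas of this section. We will be content to state the main theorem and make some useful remarks.

Consider a bipartite graph with $n$ flow nodes and $m = \beta n$ counter nodes, where each flow node connects to $k$ uniformly sampled counter nodes. Let

$$\lambda_\gamma(x) = \sum_{i=1}^{\infty} \frac{e^{-\gamma} (\gamma x)^{i-1}}{(i-1)!},$$

where $\gamma = nk/m$ is the average degree of a counter node. The degree distribution of a counter node converges to a Poisson distribution as $n \to \infty$, and $\lambda_\gamma(x)$ is the generating function for the Poisson distribution. Assume that we are given the flow size distribution and let

$$\epsilon = P(f_i > \mathrm{min}).$$

Recall that min is the minimum value of flow sizes. Let

$$f(\gamma, x) = \epsilon \left\{ 1 - \lambda_\gamma\left( 1 - [1 - \lambda_\gamma(1 - x)]^{k-1} \right) \right\}^{k-1},$$

and

$$\gamma^* \equiv \sup \{ \gamma \in \mathbb{R} : x = f(\gamma, x) \text{ has no solution } \forall x \in (0, 1] \}.$$

Theorem 2. The Threshold. We have

$$\beta^* \equiv \frac{m}{n} = \frac{k}{\gamma^*}$$

such that in the large $n$ limit
(i) If $\beta > \beta^*$, $\hat{f}(2t) \uparrow f$ and $\hat{f}(2t+1) \downarrow f$.
(ii) If $\beta < \beta^*$, there exists a positive proportion of flows such that $\hat{f}_i(2t) < \hat{f}_i(2t+1)$ for all $t$. Thus, some flows are not correctly reconstructed.⁵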
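The threshold of Theorem 2 can be estimated numerically. The sketch below is one possible approach, not the authors' procedure: it uses the closed form $\lambda_\gamma(y) = e^{-\gamma(1-y)}$ of the Poisson generating function, tests for a crossing of $y = f(\gamma, x)$ and $y = x$ on a finite grid, and bisects on $\gamma$. The function names, grid size and bisection bounds are arbitrary illustrative choices.

```python
import math


def f_recursion(gamma, x, k, eps):
    """Density-evolution map f(gamma, x) of Section 5.2."""
    def lam(y):
        # Closed form of lambda_gamma(y) = sum_i e^-gamma (gamma y)^(i-1)/(i-1)!
        return math.exp(-gamma * (1.0 - y))

    inner = 1.0 - lam(1.0 - x)
    return eps * (1.0 - lam(1.0 - inner ** (k - 1))) ** (k - 1)


def has_fixed_point(gamma, k, eps, grid=4000):
    """True if f(gamma, x) >= x at some grid point x in (0, 1]."""
    return any(
        f_recursion(gamma, i / grid, k, eps) >= i / grid
        for i in range(1, grid + 1)
    )


def gamma_star(k, eps, lo=0.01, hi=50.0, iters=40):
    """Bisection estimate of the largest gamma with no crossing in (0, 1]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if has_fixed_point(mid, k, eps):
            hi = mid
        else:
            lo = mid
    return lo
```

Taking $\epsilon = 0.595^2 \approx 0.354$ (inferred from the value $\sqrt{\epsilon} = 0.595$ quoted in Remark 3 of this section), `gamma_star(2, eps)` should come out close to $1/\sqrt{\epsilon}$, and `3 / gamma_star(3, eps)` should land near the $\beta^*$ listed for $k = 3$ in Table 2, up to grid and rounding error.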
Footnote 5: In the event of $\hat{f}_i(2t) < \hat{f}_i(2t+1)$, we know that an error has occurred. Moreover, $\hat{f}_i(2t)$ lower bounds and $\hat{f}_i(2t+1)$ upper bounds the true value $f_i$.

Figure 10: Density evolution as a walk between two curves. (Plot of the line $y = x$ and the curve $y = f(\gamma, x)$ on $[0, 1] \times [0, 1]$.)

Remark 1. The characterization of the threshold $\beta^*$ largely depends on the locally tree-like structure of the sparse random graph. More precisely, it means that the graph contains no finite-length loops as $n \to \infty$. Based on this, the density evolution principle recursively computes the error probability after a finite number of iterations, during which all incoming messages at any node are independent. With some observations specific to this algorithm, we obtain $f(\gamma, x)$ as the recursion.

Remark 2. The definition of $\gamma^*$ can be understood visually using Figure 10. The recursive computation of the error probability (footnote 6) corresponds to a walk between the curve $y = f(\gamma, x)$ and the line $y = x$, where two iterations (even and odd) correspond to one step. If $\gamma < \gamma^*$, $y = f(\gamma, x)$ is below $y = x$, and the walk continues all the way to 0, cf. Figure 10. This means that the reconstruction error is 0. If $\gamma > \gamma^*$, $y = f(\gamma, x)$ intersects $y = x$ at points above 0, and the walk ends at a non-zero intersection point. This means that there is a positive error for any number of iterations.

Footnote 6: More precisely, it refers to the probability that an outgoing message is in error.

Remark 3. The minimum value of $\beta^* = \sqrt{\epsilon}$ can be obtained after optimizing over all degree distributions, including irregular ones. For the specific bipartite graph in CB, where flow nodes have regular degree $k$ and counter nodes have Poisson distributed degrees, we obtain $\gamma^* = 1/\sqrt{\epsilon}$ and $\beta^* = 2\sqrt{\epsilon}$ for $k = 2$. The values of $\gamma^*$ and $\beta^*$ for different $k$ are listed in Table 2 for $P(f_i > x) = x^{-1.5}$. The optimum value is $\sqrt{\epsilon} = 0.595$ in this case. The value $k = 3$ achieves the lowest $\beta^*$ among $2 \le k \le 7$, which is 18% more than the optimum.

k           2      3      4      5      6      7
$\gamma^*$  1.69   4.23   5.41   6.21   6.82   7.32
$\beta^*$   1.18   0.71   0.74   0.80   0.88   0.96

Table 2: Single-layer rate for $2 \le k \le 7$. $P(f_i > x) = x^{-1.5}$.

6. MULTI-LAYER DESIGN

Given a specific flow size distribution (or an upper bound on the empirical tail distribution), we have a general algorithm that optimizes the number of bits per flow in Counter Braids over the following parameters: (1) number of layers, (2) number of hash functions in each layer, (3) depth of counters in each layer and (4) the use of status bits. We present below the results from the optimization.

Figure 11: Optimized space against number of layers. (Space in bits per flow, $r$, against number of layers, $L$, for $\alpha = 1.5$, $1.1$ and $0.6$.)

(i) Two layers are usually sufficient.
Figure 11 shows the decrease of total space (number of bits per flow) as the number of layers increases, for power-law distributions $P(f_i > x) = x^{-\alpha}$ with $\alpha = 1.5$, $1.1$ and $0.6$. For distributions with relatively light tails, such as $\alpha = 1.5$ or $1.1$, two layers accomplish the major part of the space reduction, whereas for heavier tails, such as $\alpha = 0.6$, three layers help reduce space further. Note that the distribution with $\alpha = 0.6$ has very heavy tails. For instance, the flow distributions from real Internet traces, such as those plotted in [16], have $\alpha \approx 2$. Hence two layers suffice for most network traffic.

(ii) 3 hash functions is optimal for two-layer CB.
We optimized total space over the number of hash functions in each layer for a two-layer CB. Using 3 hash functions in both layers achieves the minimum space. Fixing $k = 3$ and using the traffic distribution, we can find $\beta^*$ according to Theorem 2. The number of counters in layer 1 is $m_1 = \beta^* n$, where $n$ is the number of flows.

(iii) Layer-1 counter depth and number of layer-2 counters.
There is a tradeoff between the depth of layer-1 counters and the number of layer-2 counters, since shallow layer-1 counters overflow more often. For most network traffic with $\alpha \ge 1.1$, 4 or 5 bits in layer 1 suffice. For distributions with heavier tails, such as $\alpha = 1$, the optimal depth is 7 to 8 bits. Since layer-2 counters are much deeper than layer-1 counters, it is usually favorable to have at least one order fewer counters in layer 2.

(iv) Status bits are helpful.
We consider a two-layer CB and compare the optimized rate with and without status bits. Sizings that achieve the minimum rate with $\alpha = 1.5$ and maximum flow size 13 are summarized below. Here $r$ denotes the total number of bits per flow.
$\beta_i$ denotes the number of counters per flow in the $i$-th layer. $d_1$ denotes the number of bits in the first layer (in the two-layer case, $d_2 = \text{maximum flow size} - d_1$). $k_i$ denotes the number of hash functions in the $i$-th layer. CB with status bits achieves smaller total space, $r$. Similar results are observed with other values of $\alpha$ and maximum flow size.

                 r      $\beta_1$   $\beta_2$   $d_1$   $k_1$   $k_2$
status bit       4.13   0.71        0.065       4       3       3
no status bit    4.66   0.71        0.14        5       3       3

We summarize the above as the following rules of thumb.
1. Use a two-layer CB with status bits and 3 hash functions at each layer.
2. Empirically estimate (or guess based on historical data) the heavy-tail exponent $\alpha$ and the max flow size.
3. Compute $\beta^*$ according to Theorem 2. Set $m_1 = \beta^* n$ and $m_2 = 0.1 \beta^* n$.
4. Use 5-bit counters at layer 1 for $\alpha \ge 1.1$, and 8-bit counters for $\alpha < 1.1$. Use deep enough counters at layer 2 so that the largest flow is accommodated (in general, 64-bit counters at layer 2 are deep enough).

7. EVALUATION

We evaluate the performance of Counter Braids using both randomly generated traces and real Internet traces. In Section 7.1 we generate a random graph and a random set of flow sizes for each run of the experiment. We use $n = 1000$ and average the reconstruction error, $P_{err}$, and the average error magnitude, $E_m$, over enough rounds so that their standard deviation is less than 1/10 of their magnitude. In Section 7.2 we use 5-minute segments of two one-hour contiguous Internet traces and generate a random graph for each segment. We report results for the entire duration of two hours. The reconstruction error $P_{err}$ is the total number of errors divided by the total number of flows, and the average error magnitude $E_m$ measures how big the deviation from the actual flow size is, provided an error has occurred.

7.1 Performance

First, we compare the performance of one-layer and two-layer CB. We use 1000 flows randomly generated from the distribution $P(f_i > x) = x^{-1.5}$, whose entropy is a little less than 3 bits. We vary the total number of bits per flow in CB and compute $P_{err}$ and $E_m$. In all experiments, we use CB with 3 hash functions. For the two-layer CB, we use 4-bit deep layer-1 counters with status bits. The results are shown in Figure 12. The points labelled 1-layer and 2-layer threshold respectively are the asymptotic thresholds computed using density evolution. We observe that with 1000 flows, there is a sharp decrease in $P_{err}$ around this
asymptotic threshold. Indeed, the error is less than 1 in 1000 when the number of bits per flow is 1 bit above the asymptotic threshold. With a larger number of flows, the decrease around the threshold is expected to be even sharper.

Similarly, once above the threshold, the average error magnitude for both 1-layer and 2-layer Counter Braids is close to 1, the minimum magnitude of an error. When below the threshold, the average error magnitude increases only linearly as the number of bits decreases. At 1 bit per flow, we have 40-50% of flows incorrectly decoded, but the average error magnitude is only about 5. This means that many flow estimates are not far from the true values.

Together, we see that the 2-layer CB has much better performance than the 1-layer CB with the same space. As we increase the number of layers, the asymptotic threshold will move closer to entropy. However, we observe that the 2-layer CB has already accomplished most of the gain.

Figure 12: Performance over a varying number of bits per flow. (Two panels against bits per flow: reconstruction error $P_{err}$ on a logarithmic scale, and average error magnitude $E_m$. Curves for one layer and two layers, with markers for entropy, the 2-layer threshold and the 1-layer threshold.)

Figure 13: Performance over number of iterations. Note that $P_{err}$ for a Count-Min sketch with the same space as CB is high. (Proportion of flows estimated incorrectly against number of iterations, for CB below, at and above threshold, and for Count-Min.)

Next, we investigate the number of iterations required to reconstruct the flows. Figure 13 shows the remaining proportion of incorrectly decoded flows as the number of iterations increases. The experiments are run for 1000 flows with the same distribution as above, on a one-layer Counter Braids. The number of bits per flow is chosen to be below, at and above the asymptotic threshold. As predicted by density evolution, $P_{err}$ decreases exponentially and converges to 0 at or above the asymptotic threshold, and converges to a positive error when below threshold. In this experiment, 10 iterations are sufficient to recover most of the flows.
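The iteration experiment can be reproduced in miniature. The sketch below builds a one-layer Counter Braids with $k = 3$ edges per flow, fills the counters, and runs the message passing decoder under an assumed reading of Exhibit 2: counter-to-flow messages $\mu_{ai}(t) = \max\{c_a - \sum_{j \ne i} \nu_{ja}(t-1), \min\}$, and flow-to-counter messages $\nu_{ia}(t)$ equal to the minimum (odd $t$) or maximum (even $t$) of $\mu_{bi}(t)$ over $b \ne a$. Several choices are assumptions made only for illustration: the sampler (integer sizes with $P(f \ge x) = x^{-1.5}$), Python's `random` module standing in for hash functions, $k$ distinct counters per flow, the sizing $m = 2n$, and all function names.

```python
import random


def sample_flow(alpha, rng):
    """Integer flow size >= 1 with P(f >= x) = x**(-alpha) (assumed reading of the power law)."""
    u = 1.0 - rng.random()  # uniform on (0, 1]
    return int(u ** (-1.0 / alpha))


def build(n, m, k, alpha, seed):
    """Random flows, a random bipartite graph, and the resulting counter values."""
    rng = random.Random(seed)
    flows = [sample_flow(alpha, rng) for _ in range(n)]
    edges = [rng.sample(range(m), k) for _ in range(n)]  # k distinct counters per flow
    counters = [0] * m
    for size, nbrs in zip(flows, edges):
        for a in nbrs:
            counters[a] += size
    return flows, edges, counters


def decode(edges, counters, iters, fmin=1):
    """Message passing; returns upper bounds if iters is odd, lower bounds if even."""
    nu = [[0] * len(nbrs) for nbrs in edges]  # nu[i][s]: flow i -> its s-th counter
    mu = [[0] * len(nbrs) for nbrs in edges]  # mu[i][s]: s-th counter of flow i -> flow i
    for t in range(1, iters + 1):
        total = [0] * len(counters)
        for i, nbrs in enumerate(edges):
            for s, a in enumerate(nbrs):
                total[a] += nu[i][s]
        for i, nbrs in enumerate(edges):
            for s, a in enumerate(nbrs):
                # subtract the messages of all other flows hashing to counter a
                mu[i][s] = max(counters[a] - (total[a] - nu[i][s]), fmin)
        pick = min if t % 2 else max
        for i, nbrs in enumerate(edges):
            k = len(nbrs)
            nu[i] = [pick(mu[i][r] for r in range(k) if r != s) for s in range(k)]
    pick = min if iters % 2 else max
    return [pick(row) for row in mu]


if __name__ == "__main__":
    flows, edges, counters = build(n=1000, m=2000, k=3, alpha=1.5, seed=1)
    for T in (1, 2, 5, 10, 20):
        est = decode(edges, counters, T)
        print(T, sum(e != f for e, f in zip(est, flows)) / len(flows))
```

Because odd iteration counts return upper bounds and even counts return lower bounds, running the decoder for $T$ and $T+1$ iterations brackets every flow size, and a flow whose two bounds agree is known to be decoded exactly.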
7.2 Trace Simulation

We use two OC-48 (2.5 Gbps) one-hour contiguous traces at a San Jose router. Trace 1 was collected on Wednesday, Jan 15, 2003, 10am to 11am, hence representative of weekday traffic. Trace 2 was collected on Thursday, Apr 24, 2003, 12am to 1am, hence representative of night-time traffic. We divide each trace into twelve 5-minute segments, each corresponding to a measurement epoch. Figure 14 plots the tail distribution ($P(f_i > x)$) for all segments. Although the line rate is not high, the number of active flows is already significant. Each segment in trace 1 has approximately 0.9 million flows and 20 million packets, and each segment in trace 2 has approximately 0.7 million flows and 9 million packets. The statistics across different segments within one trace are similar.

Figure 14: Tail distribution. (Log-log plot of the tail distribution $P(f_i > x)$ against flow size in packets; legend: trace 1, 12 segments.)

Trace 1 has heavier traffic than trace 2 and also a heavier tail. In fact, it is the heaviest trace we have encountered so far, and is much heavier than, for instance, the traces plotted in [16]. The proportion of one-packet flows in trace 1 is only 0.11, similar to that of a power-law distribution with $\alpha = 0.17$. Flows with size larger than 10 packets are distributed similarly to a power law with $\alpha \approx 1$.

We fix the same sizing of CB for all segments, mimicking the realistic scenario where traffic varies over time and CB is built in hardware. We present the proportion of flows in error $P_{err}$ and the average error magnitude $E_m$ for both traces together. We vary the total number of bits in CB (footnote 7), denoted by $B$, and present the result in Table 3.

For all experiments, we use a two-layer CB with status bits, and 3 hash functions at both layers. The layer-1 counters are 8 bits deep and the layer-2 counters are 56 bits deep.

B (MB)      1.2    1.3    1.35   1.4
$P_{err}$   0.33   0.25   0.15   0
$E_m$       3      1.9    1.2    0

Table 3: Simulation results of counting 2 traces in 5-minute segments, on a fixed-size CB with total space $B$.

We observe a similar phenomenon as in Figure 12. As we underprovide space, the reconstruction error increases significantly. However, the error magnitude remains small. For these two traces, 1.4 MB is sufficient to count all flows correctly in 5-minute segments.
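The sizings of Sections 6 and 7.2 can be related by a simple space calculation. The sketch below assumes that total space is the sum over layers of (counters per flow) times (bits per counter), with one extra bit per layer-1 counter when status bits are used. This accounting is an inference rather than something the text states: it reproduces the rates of the Section 6 sizing table, up to rounding, when the maximum flow size is read as 13 bits so that $d_2 = 13 - d_1$. The function names, the use of $10^6$ bytes per MB, and the parameters in the last call are hypothetical and are not a fit to Table 3.

```python
def bits_per_flow(beta1, beta2, d1, d2, status_bit):
    """Two-layer rate r: counters per flow times counter width, summed over layers."""
    width1 = d1 + (1 if status_bit else 0)  # assumed: one status bit per layer-1 counter
    return beta1 * width1 + beta2 * d2


def total_megabytes(n_flows, beta1, beta2, d1, d2, status_bit):
    """Total counter space in MB, taking 1 MB = 10**6 bytes (an assumption)."""
    return n_flows * bits_per_flow(beta1, beta2, d1, d2, status_bit) / 8 / 1e6


if __name__ == "__main__":
    # Section 6 sizings, with d2 = 13 - d1.
    print(bits_per_flow(0.71, 0.065, 4, 13 - 4, True))
    print(bits_per_flow(0.71, 0.14, 5, 13 - 5, False))
    # Hypothetical: 0.9 million flows, 8-bit layer 1 with status bits, 56-bit layer 2,
    # and m2 = 0.1 * m1 as in the rules of thumb.
    print(total_megabytes(900_000, 0.71, 0.071, 8, 56, True))
```

The first two calls give approximately 4.135 and 4.67 bits per flow, against 4.13 and 4.66 in the table; the small differences are consistent with rounding of the tabulated $\beta$ values.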
Footnote 7: We are not using bits per flow here since the number of flows is different in different segments.

8. IMPLEMENTATION

8.1 On-Chip Updates

Each layer of CB can be built on a separate block of SRAM to enable pipelining. On pre-built memories, the counter depth is chosen to be an integer fraction of the word length, so as to maximize space usage. This constraint does not exist with custom-made memories.

We need a list of flow labels to construct the first-layer graph for reconstruction. In cases where access frequencies for prefixes or filters are being collected, the flow nodes are simply the set of prefixes or filter criteria, which are the same across all measurement epochs. Hence no flow labels need to be collected or transferred. In other cases, where the flow labels are elements of a large space (e.g. flow 5-tuples), the labels need to be collected and transferred to the decoding unit. The method for collecting flow labels is application-specific, and may depend on the particular implementation of the application. We give the following suggestion for collecting flow 5-tuples in a specific scenario.

For TCP flows, a flow label can be written to a DRAM which maintains flow IDs when a flow is established; for example, when a "SYN" packet arrives. Since flows are established much less frequently than packets arrive (approximately one in 40 packets causes a flow to be set up [10]), these memory accesses do not create a bottleneck. Flows that span boundaries of measurement epochs can be identified using a Bloom Filter [3].

Finally, we evaluated the algorithm by measuring flow sizes in packets. The algorithm can also be used to measure flow sizes in bytes. Since most byte-counting is really the counting of byte-chunks (e.g. 32 or 64 byte-chunks), there is the question of choosing the "right granularity": a small value gives accurate counts but uses more space, and vice versa. We are working on a nice approach to this problem and will report results in future publications.

8.2 Computation Cost of Decoder

We reconstruct the flow sizes using the iterative message passing algorithm in an offline unit. The decoding complexity is linear in the number of flows. Decoding CB with more than one layer imposes only a small additional cost, since the higher layers are 1-2 orders smaller than the
first layer. For example, decoding 1 million flows on a two-layer Counter Braids takes, on average, 15 seconds on a 2.6 GHz machine.

9. CONCLUSION AND FURTHER WORK

We presented Counter Braids, an efficient minimum-space counter architecture that solves large-scale network measurement problems such as per-flow and per-prefix counting. Counter Braids incrementally compresses the flow sizes as it counts, and the message passing reconstruction algorithm recovers flow sizes almost perfectly. We minimize counter space with incremental compression, and solve the flow-to-counter association problem using random graphs. As shown by real trace simulations, we are able to count up to 1 million flows purely in SRAM and recover the exact flow sizes. We are currently implementing this in an FPGA to determine the actual memory usage and to better understand implementation issues.

Several directions are open for further exploration. We mention two: (i) Since a flow passes through multiple routers, and since our algorithm is amenable to a distributed implementation, it will save counter space dramatically to combine the counts collected at different routers. (ii) Since our algorithm "degrades gracefully," in the sense that if the amount of space is less than the required amount, we can still recover many flows accurately and have errors of known size on a few, it is worth studying the graceful degradation formally as a "lossy compression" problem.

Acknowledgement: Support for OC-48 data collection is provided by DARPA, NSF, DHS, Cisco and CAIDA members. This work has been supported in part by NSF Grant Number 0653876, for which we are thankful. We also thank the Clean Slate Program at Stanford University, and the Stanford Graduate Fellowship program for supporting part of this work.

10. REFERENCES

[1] http://www.cisco.com/warp/public/732/Tech/netflow.
[2] Juniper networks solutions for network accounting. www.juniper.net/techcenter/appnote/350003.html.
[3] B. Bloom. Space/time trade-offs in hash coding with allowable errors. Comm. ACM, 13, July 1970.
[4] J. W. Byers, M. Luby, M. Mitzenmacher, and A. Rege. A digital fountain approach to reliable distribution of bulk data. In SIGCOMM, pages 56-67, 1998.
[5] G. Caire, S. Shamai, and S. Verdu. Noiseless data compression with low density parity check codes. In DIMACS, New York, 2004.
[6] E. Candes and T. Tao. Near optimal signal recovery from random projections and universal encoding strategies. IEEE Trans. Inform. Theory, 2004.
[7] G. Cormode and S. Muthukrishnan. An improved data stream summary: the count-min sketch and its applications. Journal of Algorithms, 55(1), April 2005.
[8] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley, New York, 1991.
[9] M. Crovella and A. Bestavros. Self-similarity in world wide web traffic: Evidence and possible causes. IEEE/ACM Trans. Networking, 1997.
[10] S. Dharmapurikar and V. Paxson. Robust tcp stream reassembly in the presence of adversaries. 14th USENIX Security Symposium, 2005.
[11] D. Donoho. Compressed sensing. IEEE Trans. Inform. Theory, 52(4), April 2006.
[12] C. Estan and G. Varghese. New directions in traffic measurement and accounting. Proc. ACM SIGCOMM Internet Measurement Workshop, pages 75-80, 2001.
[13] R. G. Gallager. Low-Density Parity-Check Codes. MIT Press, Cambridge, Massachusetts.
[14] M. Grossglauser and J. Rexford. Passive traffic measurement for ip operations. The Internet as a Large-Scale Complex System, 2002.
[15] F. Kschischang, B. Frey, and H.-A. Loeliger. Factor graphs and the sum-product algorithm. IEEE Trans. Inform. Theory, 47:498-519, 2001.
[16] A. Kumar, M. Sung, J. J. Xu, and J. Wang. Data streaming algorithms for efficient and accurate estimation of flow size distribution. Proceedings of ACM SIGMETRICS, 2004.
[17] Y. Lu, A. Montanari, and B. Prabhakar. Detailed network measurements using sparse graph counters: The theory. Allerton Conference, September 2007.
[18] M. Luby, M. Mitzenmacher, A. Shokrollahi, D. A. Spielman, and V. Stemann. Practical loss-resilient codes. In Proc. of STOC, pages 150-159, 1997.
[19] M. Mezard and A. Montanari. Constraint satisfaction networks in Physics and Computation. In Preparation.
[20] S. Ramabhadran and G. Varghese. Efficient implementation of a statistics counter architecture. Proc. ACM SIGMETRICS, pages 261-271, 2003.
[21] T. Richardson and R. Urbanke. Modern Coding Theory. Cambridge University Press, 2007.
[22] D. Shah, S. Iyer, B. Prabhakar, and N. McKeown. Analysis of a statistics counter architecture. Proc. IEEE HotI 9.
[23] Q. G. Zhao, J. J. Xu, and Z. Liu. Design of a novel statistics counter architecture with optimal space and time efficiency. SIGMetrics/Performance, June 2006.

Appendix: Asymptotic Optimality

We state the result on asymptotic optimality without a proof. The complete proof can be found in [17]. We make two assumptions on the flow size distribution $p$:

1. It has at most power-law tails. By this we mean that $P\{f_i \ge x\} \le A x^{-\epsilon}$ for some constant $A$ and some $\epsilon > 0$. This is a valid assumption for network statistics [9].
2. It has decreasing digit entropy. Write $f_i$ in its $q$-ary expansion $\sum_{a \ge 0} f_i(a) q^a$. Let $h_l = -\sum_x P(f_i(l) = x) \log_q P(f_i(l) = x)$ be the $q$-ary entropy of $f_i(l)$. Then $h_l$ is monotonically decreasing in $l$ for any $q$ large enough.

We call a distribution $p$ with these two properties admissible. This class includes most cases of practical interest. For instance, any power-law distribution is admissible. The (binary) entropy of this distribution is denoted by $H_2(p) \equiv -\sum_x p(x) \log_2 p(x)$.

For this section only, we assume that all counters in CB have an equal depth of $d$ bits. Let $q = 2^d$.

Definition 2. We represent CB as a sparse graph $G$, with vertices consisting of $n$ flows and a total of $m(n)$ counters in all layers. A sequence of Counter Braids $\{G_n\}$ has design rate $r$ if

$$r = \lim_{n \to \infty} \frac{m(n)}{n} \log_2 q. \qquad (1)$$

It is reliable for the distribution $p$ if there exists a sequence of reconstruction functions $\hat{F}_n \equiv \hat{F}_{G_n}$ such that

$$P_{err}(G_n, \hat{F}_n) \equiv P\{\hat{F}_n(c) \ne f\} \to 0 \quad \text{as } n \to \infty. \qquad (2)$$

Here is the main theorem:

Theorem 3. For any admissible input distribution $p$, and any rate $r > H_2(p)$, there exists a sequence of reliable sparse Counter Braids with asymptotic rate $r$.

The theorem is satisfying as it shows that the CB architecture is fundamentally good in the information-theoretic sense. Despite being incremental and linear, it is as good as, for example, Huffman codes, at infinite block length.
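The digit entropies $h_l$ of assumption 2 can be estimated from a sample of flow sizes. The sketch below expands each flow size in base $q$ and computes the empirical $q$-ary entropy of every digit position. The sampler, the choice $q = 16$ (that is, $d = 4$ bits) and all names are assumptions for illustration. An estimate from a finite sample is only suggestive of the monotonicity the assumption requires, since high-order digits are nonzero for very few flows.

```python
import math
import random
from collections import Counter


def digits(f, q, n_digits):
    """Base-q digits of f, least significant first: f = sum_a digits[a] * q**a."""
    out = []
    for _ in range(n_digits):
        out.append(f % q)
        f //= q
    return out


def digit_entropies(flows, q, n_digits):
    """Empirical q-ary entropy h_l of the l-th base-q digit, for l = 0..n_digits-1."""
    n = len(flows)
    expansions = [digits(f, q, n_digits) for f in flows]
    entropies = []
    for l in range(n_digits):
        counts = Counter(e[l] for e in expansions)
        entropies.append(-sum(c / n * math.log(c / n, q) for c in counts.values()))
    return entropies


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed sampler: integer sizes with P(f >= x) = x**(-1.5).
    flows = [int((1.0 - rng.random()) ** (-1.0 / 1.5)) for _ in range(100_000)]
    for l, h in enumerate(digit_entropies(flows, q=16, n_digits=4)):
        print(l, round(h, 4))
```

With $q = 2^d$, digit position $l$ plays the role of the content held at depth-$d$ counters of layer $l + 1$, which is the connection between digit entropies and the per-layer space in the design rate (1).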