The Adaptive Radix Tree: ARTful Indexing ...

Abstract: Main memory capacities have grown up to a point where most databases fit into RAM. For main-memory database systems, index structure performance is a critical bottleneck. Traditional in-memory data structures like balanced binary search
Programming Contest 2009. The KISS-Tree [20] is a more efficient radix tree proposal with only three levels, but can only store 32 bit keys. It uses an open addressing scheme for the first 16 bits of the key and relies on the virtual memory system to save space. The second level, responsible for the next 10 bits, uses an array representation and the final level compresses 6 bits using a bit vector. The idea of dynamically changing the internal node representation is used by the Judy array data structure which was developed at Hewlett-Packard research labs [21], [22].

Graefe discusses binary-comparable (normalized) keys, e.g. [23], as a way of simplifying and speeding up key comparisons. We use this concept to obtain meaningful order for the keys stored in radix trees.

III. ADAPTIVE RADIX TREE

This section presents the adaptive radix tree (ART). We start with some general observations on the advantages of radix trees over comparison-based trees. Next, we motivate the use of adaptive nodes by showing that the space consumption of conventional radix trees can be excessive. We continue with describing ART and algorithms for search and insertion. Finally, we analyze the space consumption.

A. Preliminaries

Radix trees have a number of interesting properties that distinguish them from comparison-based search trees:
- The height (and complexity) of radix trees depends on the length of the keys but in general not on the number of elements in the tree.
- Radix trees require no rebalancing operations and all insertion orders result in the same tree.
- The keys are stored in lexicographic order.
- The path to a leaf node represents the key of that leaf. Therefore, keys are stored implicitly and can be reconstructed from paths.

Radix trees consist of two types of nodes: Inner nodes, which map partial keys to other nodes, and leaf nodes, which store the values corresponding to the keys. The most efficient representation of an inner node is as an array of $2^s$ pointers. During tree traversal, an $s$ bit chunk of the key is used as the index into that array and thereby determines the next child node without any additional comparisons. The parameter $s$, which we call span, is critical for the performance of radix trees, because it determines the height of the tree for a given key length: A radix tree storing $k$ bit keys has $\lceil k/s \rceil$ levels of inner nodes. With 32 bit keys, for example, a radix tree using $s = 1$ has 32 levels, while a span of 8 results in only 4 levels.

Because comparison-based search trees are the prevalent indexing structures in database systems, it is illustrative to compare the height of radix trees with the number of comparisons in perfectly balanced search trees. While each comparison rules out half of all values in the best case, a radix tree node can rule out more values if $s > 1$. Therefore, radix trees have smaller height than binary search trees for $n > 2^{k/s}$. This relationship is illustrated in Figure 2 and assumes that keys can be compared in $O(1)$ time. For large keys, comparisons actually take $O(k)$ time and therefore the complexity of search trees is $O(k \log n)$, as opposed to the radix tree complexity of $O(k)$. These observations suggest that radix trees, in particular with a large span, can be more efficient than traditional search trees.

Fig. 2. Tree height of perfectly balanced binary search trees and radix trees.

B. Adaptive Nodes

As we have seen, from a (pure lookup) performance standpoint, it is desirable to have a large span. When arrays of pointers are used to represent inner nodes, the disadvantage of a large span is also clear: Space usage can be excessive when most child pointers are null. This tradeoff is illustrated in Figure 3 which shows the height and space consumption for different values of the span parameter when storing 1M uniformly distributed 32 bit integers. As the span increases, the tree height decreases linearly, while the space consumption increases exponentially. Therefore, in practice, only some values of $s$ offer a reasonable tradeoff between time and space. For example, the Generalized Prefix Tree (GPT) uses a span of 4 bits [19], and the radix tree used in the Linux kernel (LRT) uses 6 bits [24]. Figure 3 further shows that our adaptive radix tree (ART), at the same time, uses less space and has smaller height than radix trees that only use homogeneous array nodes.

The key idea that achieves both space and time efficiency is to adaptively use different node sizes with the same, relatively large span, but with different fanout. Figure 4 illustrates this

Fig. 3. Tree height and space consumption for different values of the span parameter $s$ when storing 1M uniformly distributed 32 bit integers. Pointers are 8 bytes long and nodes are expanded lazily.
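The height relation from the Preliminaries can be checked numerically. The following is a minimal sketch in Python, not code from the paper: the function names are illustrative assumptions, and the height of a perfectly balanced binary search tree is approximated as $\lceil \log_2 n \rceil$ comparisons.

```python
def radix_tree_height(key_bits: int, span: int) -> int:
    """Levels of inner nodes for k-bit keys with span s: ceil(k / s)."""
    if key_bits <= 0 or span <= 0:
        raise ValueError("key_bits and span must be positive")
    return -(-key_bits // span)  # integer ceiling division


def balanced_bst_height(num_keys: int) -> int:
    """Assumed model: ceil(log2 n) comparisons in a perfectly balanced BST."""
    if num_keys <= 0:
        raise ValueError("num_keys must be positive")
    return (num_keys - 1).bit_length()  # equals ceil(log2 n) for n >= 1


def radix_is_shallower(num_keys: int, key_bits: int, span: int) -> bool:
    """True if the radix tree has fewer levels than the balanced BST."""
    return radix_tree_height(key_bits, span) < balanced_bst_height(num_keys)


if __name__ == "__main__":
    # Spans mentioned in the text: 1, 4 (GPT), 6 (Linux kernel), 8.
    for span in (1, 4, 6, 8):
        print(f"span={span}: height={radix_tree_height(32, span)}")
```

For 32 bit keys and a span of 8, the break-even point under this model is $2^{32/8} = 16$ keys: `radix_is_shallower(16, 32, 8)` is false, while `radix_is_shallower(17, 32, 8)` is true.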
are equal, a negative value if the first non-equal value of the first vector is less than the corresponding byte of the second vector, and otherwise a positive value.

For finite domains, it is always possible to transform the values of any strictly totally ordered domain to binary-comparable keys: Each value of a domain of size $n$ is mapped to a string of $\lceil \log_2 n \rceil$ bits storing the zero-extended rank minus one.

B. Transformations

We now discuss how common data types can be transformed to binary-comparable keys.

a) Unsigned Integers: The binary representation of unsigned integers already has the desired order. However, the endianness of the machine must be taken into account when storing the value into main memory. On little endian machines, the byte order must be swapped to ensure that the result is ordered from most to least significant byte.

b) Signed Integers: Signed two's-complement integers must be reordered because, when compared as unsigned values, negative integers are greater than the positive values. A $b$ bit integer $x$ is transformed very efficiently by flipping the sign bit (using $x \text{ XOR } 2^{b-1}$). The resulting value is then stored like an unsigned value.

c) IEEE 754 Floating Point Numbers: For floating point values, the transformation is more involved, although conceptually not difficult. Each value is first classified as positive or negative, and as normalized number, denormalized number, NaN, $\infty$, or 0. Because these 10 classes do not overlap, a new rank can easily be computed and then stored like an unsigned value. One key transformation requires 3 if statements, 1 integer multiplication, and 2 additions.

d) Character Strings: The Unicode Collation Algorithm (UCA) defines complex rules for comparing Unicode strings. There are open source libraries which implement this algorithm and which offer functions to transform Unicode strings to binary-comparable keys². In general, it is important that each string is terminated with a value which does not appear anywhere else in any string (e.g., the 0 byte). The reason is that keys must not be prefixes of other keys.

e) Null: To make a null value binary comparable, it must be assigned a value with some particular rank. For most data types, all possible values are already used. A simple solution is to increase the key length of all values by one to obtain space for the null value, e.g., 4 byte integers become 5 bytes long. A more efficient way
to accommodate the null value is to increase the length only for some values of the domain. For example, assuming null should be less than all other 4 byte integers, null can be mapped to the byte sequence 0,0,0,0,0, the previously smallest value 0 is mapped to 0,0,0,0,1, and all other values retain their 4 byte representation.

f) Compound Keys: Keys consisting of multiple attributes are easily handled by transforming each attribute separately and concatenating the results.

² The C/C++ library International Components for Unicode (http://site.icu-project.org/), for example, provides the ucol_getSortKey function for this purpose.

V. EVALUATION

In this section, we experimentally evaluate ART and compare its performance to alternative in-memory data structures, including comparison-based trees, hashing, and radix trees. The evaluation has two parts: First, we perform a number of microbenchmarks, implemented as standalone programs, with all evaluated data structures. In the second part, we integrate some of the data structures into the main-memory database system HyPer. This allows us to execute a more realistic workload, the standard OLTP benchmark TPC-C.

We used a high-end desktop system with an Intel Core i7 3930K CPU which has 6 cores, 12 threads, 3.2 GHz clock rate, and 3.8 GHz turbo frequency. The system has 12 MB shared, last-level cache and 32 GB quad-channel DDR3-1600 RAM. We used Linux 3.2 in 64 bit mode as operating system and GCC 4.6 as compiler.

As contestants, we used a B+-tree optimized for main memory (Cache-Sensitive B+-tree [CSB]), two read-only search structures optimized for modern x86 CPUs (k-ary search tree [kary], Fast Architecture Sensitive Tree [FAST]), a radix tree (Generalized Prefix Tree [GPT]), and two textbook data structures (red-black tree [RB], chained hash table [HT] using MurmurHash64A for 64-bit platforms [25]). For a fair comparison, we used source code provided by the authors if it was available. This was the case for the CSB+-Tree [26], k-ary search [27], and the Generalized Prefix Tree [28]. We used our own implementation for the remaining data structures.

We were able to validate that our implementation of FAST, which we made available online [29], matches the originally published numbers. To calibrate for the different hardware, we used the results for k-ary search which were published in the same paper. Our implementation of FAST uses 2 MB memory pages, and aligns all cache line blocks to 64 byte boundaries, as suggested by Yamamuro et al. [30]. However, because FAST and k-ary search return the rank of the key instead of the tuple identifier, the following results include one additional lookup in a separate array of tuple identifiers in order to evaluate a meaningful lookup in the database context.

We had to use 32 bit integers as keys for the microbenchmarks because some of the implementations only support 32 bit integer keys. For such very short keys, path compression usually increases space consumption instead of reducing it. Therefore, we removed this feature for the microbenchmarks. Path compression is enabled in the more realistic second part of the evaluation. In contrast to comparison-based trees and hash tables, the performance of radix trees varies with the distribution of the keys. We therefore show results for dense keys ranging from 1 to $n$ ($n$ denotes the size of the tree in number of keys) and sparse keys where each bit is equally likely 0 or 1. We randomly permuted the dense keys.

Fig. 12. Impact of skew on search performance (16M keys).

B. Caching Effects

Let us now investigate caching effects. For modern CPUs, caches are extremely important, because DRAM latency amounts to hundreds of CPU cycles. Tree structures, in particular, benefit from caches very much because frequently accessed nodes and the top levels are usually cached. To quantify these caching effects, we compare two tree structures, ART (with dense keys) and FAST, to a hash table.

Random lookup, which we performed so far, is the worst case for caches because this access pattern has bad temporal locality. In practice, skewed access patterns are very common, e.g., recent orders are accessed more often than old orders. We simulated such a scenario by looking up Zipf distributed keys instead of random keys. Figure 12 shows the impact of increasing skew on the performance of the three data structures. All data structures perform much better in the presence of skew because the number of cache misses decreases. As the skew increases, the performance of ART and the hash table approaches their speed in small, cache resident trees. For FAST the speedup is smaller because it requires more comparisons and offset calculations which are not improved by caching.

We now turn our attention
to the influence of the cache size. In the previous experiments, we only performed lookups in a single tree. As a consequence, the entire cache was utilized, because there were no competing memory accesses. In practice, caches usually contain multiple indexes and other data. To simulate competing accesses and therefore effectively smaller caches, we look up keys in multiple data structures in a round-robin fashion. Each data structure stores 16M random, dense keys and occupies more than 128 MB. Figure 13 shows that the hash table is mostly unaffected, as it does not use caches effectively anyway, while the performance of the trees improves with increasing cache size, because more often-traversed paths are cached. With 1/64th of the cache (192 KB), ART reaches only about one third of the performance of the entire cache (12 MB).

Fig. 13. Impact of cache size on search performance (16M keys).

C. Updates

Besides efficient search, an indexing structure must support efficient updates as well. Figure 14 shows the throughput when inserting 16M random keys into an empty structure. Although ART must dynamically replace its internal data structures as the tree grows, it is more efficient than the other data structures. The impact of adaptive nodes on the insertion performance (in comparison with only using Node256) is 20% for trees with 16M dense keys. Since the space savings from adaptive nodes can be large, this is usually a worthwhile tradeoff. In comparison with incremental insertion, bulk insertion increases performance by a factor of 2.5 for sparse keys and by 17% for dense keys. When sorted keys, e.g., surrogate primary keys, are inserted, the performance of ordered search trees increases because of caching effects. For ART, 50 million sorted, dense keys can be inserted per second. Only the hash table does not benefit from the sorted order because hashing randomizes the access pattern.

Fig. 14. Insertion of 16M keys into an empty index structure.

FAST and the k-ary search tree are static data structures that can only be updated by rebuilding them, which is why they were not included in the previous experiment. One possibility for using read-only data structures in applications that require incremental updates is to use a delta mechanism: A second data structure, which supports online updates, stores differences and is periodically merged with the read-only structure. To evaluate the feasibility of this approach, we used a red-black tree to store the delta plus FAST as the main search structure, and compared it to ART (with dense keys) and a hash table. We used the optimal merging frequency between