
the term spatial data management just as broad as the range of applications. In VLSI CAD and cartography, this term refers to applications that rely mostly on two-dimensional or layered two-dimensional data. VLSI data are usually represented by rectilinear polylines or polygons whose edges are iso-oriented, that is, parallel to the coordinate axes. Typical operations include intersection and geometric routing [Shekhar and Liu 1995]. Cartographic data are also two-dimensional with points, lines, and regions as basic primitives. In contrast to VLSI CAD, however, the shapes are often characterized by extreme irregularities. Common operations include spatial searches and map overlay, as well as distance-related operations. In mechanical CAD, on the other hand, data objects are usually three-dimensional solids. They may be represented in a variety of data formats, including cell decomposition schemes, constructive solid geometry (CSG), and boundary representations [Kemper and Wallrath 1987].

Yet other applications emphasize the processing of unanalyzed images, such as X-rays and satellite imagery, from which features are extracted. In those areas, the terms spatial database and image database are sometimes even used interchangeably. Strictly speaking, however, spatial databases contain multidimensional data with explicit knowledge about objects, their extent, and their position in space. The objects are usually represented in some vector-based format, and their relative positions may be explicit or implicit (i.e., derivable from the internal representation of their absolute positions). Image databases often place less emphasis on data analysis. They provide storage and retrieval for unanalyzed pictorial data, which are typically represented in some raster format. Techniques developed for the storage and manipulation of image data can be applied to other media as well, such as infrared sensor signals or sound.

In this survey we assume that the goal is to manipulate analyzed multidimensional data and that unanalyzed images are handled only as the source from which spatial data can be derived. The challenge for the developers of a spatial database system lies not so much in providing yet another collection of special-purpose data structures. Rather, one has to find abstractions and architectures to implement generic systems, that is, to build systems with generic spatial data-management capabilities that can be tailored to the requirements of a particular application domain. Important issues in this context include the handling of spatial representations and data models, multidimensional access methods, and pictorial or spatial query languages and their optimization.

This article is a survey of multidimensional access methods to support search operations in spatial databases. Figure 1, which was inspired by a similar graph by Lu and Ooi [1993], gives a first overview of the diversity of existing multidimensional access methods. The goal is not to describe all of these structures, but to discuss the most prominent ones, to present possible taxonomies, and to establish references to other literature.

1. INTRODUCTION
2. ORGANIZATION OF SPATIAL DATA
2.1 What Is Special About Spatial?
2.2 Definitions and Queries
3. BASIC DATA STRUCTURES
3.1 One-Dimensional Access Methods
3.2 Main Memory Structures
4. POINT ACCESS METHODS
4.1 Multidimensional Hashing
4.2 Hierarchical Access Methods
4.3 Space-Filling Curves for Point Data
5. SPATIAL ACCESS METHODS
5.1 Transformation
5.2 Overlapping Regions
5.3 Clipping
5.4 Multiple Layers
6. COMPARATIVE STUDIES
7. CONCLUSIONS

Multidimensional Access Methods · 171 ACM Computing Surveys, Vol. 30, No. 2, June 1998
Several shorter surveys have been published previously in various Ph.D. theses such as Ooi [1990], Kolovson [1990], Oosterom [1990], and Schiwietz [1993]. Widmayer [1991] gives an overview of work published before 1991. Like the thesis by Schiwietz, however, his survey is available only in German. Samet's books [1989, 1990] present the state of the art until 1989. However, they primarily cover quadtrees and related data structures. Lomet [1991] discusses the field from a systems-oriented point of view.

The remainder of the article is organized as follows. Section 2 discusses some basic properties of spatial data and their implications for the design and implementation of spatial databases. Section 3 gives an overview of some traditional data structures that had an impact on the design of multidimensional access methods. Sections 4 and 5 form the core of this survey, presenting a variety of point access methods (PAMs) and spatial access methods (SAMs), respectively. Some remarks about theoretical and experimental analyses are contained in Section 6, and Section 7 concludes the article.

Figure 1. History of multidimensional access methods.

172 · V. Gaede and O. Günther, ACM Computing Surveys, Vol. 30, No. 2, June 1998

2. ORGANIZATION OF SPATIAL DATA

2.1 What Is Special About Spatial?

To obtain a better understanding of the requirements in spatial database systems, we first discuss some basic properties of spatial data. First, spatial data have a complex structure. A spatial data object may be composed of a single point or several thousands of polygons,
arbitrarily distributed across space. It is usually not possible to store collections of such objects in a single relational table with a fixed tuple size. Second, spatial data are often dynamic. Insertions and deletions are interleaved with updates, and data structures used in this context have to support this dynamic behavior without deteriorating over time. Third, spatial databases tend to be large. Geographic maps, for example, typically occupy several gigabytes of storage. The integration of secondary and tertiary memory is therefore essential for efficient processing [Chen et al. 1995]. Fourth, there is no standard algebra defined on spatial data, although several proposals have been made in the past [Egenhofer 1989; Güting 1989; Scholl and Voisard 1989; Güting and Schneider 1993]. This means in particular that there is no standardized set of base operators. The set of operators depends heavily on the given application domain, although some operators (such as intersection) are more common than others. Fifth, many spatial operators are not closed. The intersection of two polygons, for example, may return any number of single points, dangling edges, or disjoint polygons. This is particularly relevant when operators are applied consecutively. Sixth, although the computational costs vary among spatial database operators, they are generally more expensive than standard relational operators.

An important class of geometric operators that needs special support at the physical level is the class of spatial search operators. Retrieval and update of spatial data are usually based not only on the value of certain alphanumeric attributes but also on the spatial location of a data object. A retrieval query on a spatial database often requires the fast execution of a geometric search operation such as a point or region query. Both operations require fast access to those data objects in the database that occupy a given location in space.

To support such search operations, one needs special multidimensional access methods. The main problem in the design of such methods, however, is that there exists no total ordering among spatial objects that preserves spatial proximity. In other words, there is no mapping from two- or higher-dimensional space into one-dimensional space such that any two objects that are spatially close in the higher-dimensional space are also close to each other in the one-dimensional sorted sequence. This makes the design of efficient access methods in the spatial domain much more difficult than in traditional databases, where a broad range of efficient and well-understood access methods is available.

Examples for such one-dimensional access methods (also called single key structures, although that term is somewhat misleading) include the B-tree [Bayer and McCreight 1972] and extendible hashing [Fagin et al. 1979]; see Section 3.1 for a brief discussion. A popular approach to handling multidimensional search queries consists of the consecutive application of such single key structures, one per dimension. Unfortunately, this approach can be very inefficient [Kriegel 1984]. Since each index is traversed independently of the others, we cannot exploit the possibly high selectivity in one dimension to narrow down the search in the remaining dimensions. In general, there is no easy and obvious way to extend single key structures in order to handle multidimensional data.

There is a variety of requirements that multidimensional access methods should meet, based on the properties of spatial data and their applications [Robinson 1981; Lomet and Salzberg 1989; Nievergelt 1989]:

Dynamics. As data objects are inserted and deleted from the database in any given order, access methods should continuously keep track of the changes.
Secondary/tertiary storage management. Despite growing main memories, it is often not possible to
hold the complete database in main memory. Therefore, access methods need to integrate secondary and tertiary storage in a seamless manner.
Broad range of supported operations. Access methods should not support just one particular type of operation (such as retrieval) at the expense of other tasks (such as deletion).
Independence of the input data and insertion sequence. Access methods should maintain their efficiency even when input data are highly skewed or the insertion sequence is changed. This point is especially important for data that are distributed differently along the various dimensions.
Simplicity. Intricate access methods with many special cases are often error-prone to implement and thus not sufficiently robust for large-scale applications.
Scalability. Access methods should adapt well to database growth.
Time efficiency. Spatial searches should be fast. A major design goal is to meet the performance characteristics of one-dimensional B-trees: first, access methods should guarantee a logarithmic worst-case search performance for all possible input data distributions regardless of the insertion sequence and second, this worst-case performance should hold for any combination of the d attributes.
Space efficiency. An index should be small in size compared to the data to be addressed and therefore guarantee a certain storage utilization.
Concurrency and recovery. In modern databases where multiple users concurrently update, retrieve, and insert data, access methods should provide robust techniques for transaction management without significant performance penalties.
Minimum impact. The integration of an access method into a database system should have minimum impact on existing parts of the system.

2.2 Definitions and Queries

We have already introduced the term multidimensional access methods to denote the large class of access methods that support searches in spatial databases and are the subject of this survey. Within this class, we distinguish between point access methods (PAMs) and spatial access methods (SAMs). Point access methods have primarily been designed to perform spatial searches on point databases (i.e., databases that store only points). The points may be embedded in two or more dimensions, but they do not have a spatial extension. Spatial access methods, however, can manage extended objects, such as lines, polygons, or even higher-dimensional polyhedra. In the literature, one often finds the term spatial access method referring to what we call multidimensional access method. Other terms used for this purpose include spatial index structure.

We generally assume that the given objects are embedded in d-dimensional Euclidean space $E^d$ or a suitable subspace thereof. In this article, this space is also referred to as the universe or original space. Any point object stored in a spatial database has a unique location in the universe, defined by its d coordinates. Unless the distinction is essential, we use the term point both for locations in space and for point objects stored in the database. Note, however, that any point in space can be occupied by several point objects stored in the database.

A (convex) d-dimensional polytope $P$ in $E^d$ is defined to be the intersection of some finite number of closed halfspaces in $E^d$, such that the dimension of the smallest affine subspace containing $P$ is $d$. If $a \in E^d - \{0\}$ and $c \in E^1$, then the $(d-1)$-dimensional set $H(a,c) = \{x \in E^d : x \cdot a = c\}$ defines a hyperplane in $E^d$. A hyperplane $H(a,c)$ defines two closed halfspaces: the positive halfspace $1 \cdot H(a,c) = \{x \in E^d : x \cdot a \ge c\}$, and the negative halfspace $-1 \cdot H(a,c) = \{x \in E^d : x \cdot a \le c\}$. A hyperplane $H(a,c)$ supports a polytope $P$ if $H(a,c) \cap P \ne \emptyset$ and $P \subseteq 1 \cdot H(a,c)$, that is, if $H(a,c)$ embeds parts of $P$'s boundary. If $H(a,c)$ is any hyperplane supporting $P$, then $P \cap H(a,c)$ is a face of $P$. The faces of dimension 1 are called edges; those of dimension 0, vertices.

By forming the union of some finite number of polytopes $P_1, \ldots, P_n$, we obtain a (d-dimensional) polyhedron $Q$ in $E^d$ that is not necessarily convex. Following the intuitive understanding of polyhedra, we require that the $P_i$ ($i = 1, \ldots, n$) be connected. Note that this still allows for polyhedra with holes. Each face of $Q$ is either the face of some $P_i$, or a fraction thereof, or the result of the intersection of two or more $P_i$. Each polyhedron divides the points in space into three subsets that are mutually disjoint: its interior, its boundary, and its exterior.

Following usual conventions, we use the term line to denote a one-dimensional polyhedron and the term polygon to denote a two-dimensional polyhedron. We further assume that for each $k$ ($0 \le k \le d$) the set of k-dimensional polyhedra forms a data type, which leads us to the common collection of spatial data types {Point, Line, Polygon, ...}. Combined types sometimes also occur. An object $o$ in a spatial database is usually defined by several nonspatial attributes and one attribute of some spatial data type. This spatial attribute describes the object's spatial extent $o.G$. In the spatial database literature, terms such as spatial extension are often used instead of spatial extent. For the description of $o.G$ one finds the terms shape descriptor, shape description, shape information, and geometric description, among others.

Indices often perform more efficiently when handling simple entries of the same size. One therefore often abstracts from the actual shape of a spatial object before inserting it into an index. This can be achieved by approximating the original data object with a simpler shape, such as a bounding box or a sphere. Given a minimum bounding interval $I_i(o) = [l_i, u_i]$ ($l_i, u_i \in E^1$) describing the extent of the spatial object $o$ along dimension $i$, the d-dimensional minimum bounding box (MBB) is defined by $I^d(o) = I_1(o) \times I_2(o) \times \cdots \times I_d(o)$.

An index may administer only the MBB of each object, together with a pointer to the object's database entry (object ID or object reference). With this design, the index produces only a set of candidate solutions (Figure 2). For each candidate obtained during this filter step, we have to decide whether the MBB is sufficient to guarantee that the object itself satisfies the search predicate. In those cases, the object can be added directly to the query result (dashed line). However, there are often cases where the MBB does not prove sufficient. In a refinement step we then must retrieve the exact shape information from secondary memory and test it against the predicate. If the predicate evaluates to true, the object is added to the query result; otherwise we have a false drop.

Figure 2. Multistep spatial query processing [Brinkhoff et al. 1994].

Another way of obtaining simple index entries is to represent the shape of each data object as the geometric union of simpler shapes (e.g., convex polygons with a bounded number of vertices). This approach is called decomposition.

We have mentioned the term efficiency several times so far without giving a formal definition. In the case of space efficiency, this can easily be done: the goal is to minimize the number of bytes occupied by the index. For time efficiency the situation is not so clear. Elapsed time is obviously what the user cares about, but one should keep in mind that the corresponding measurements greatly depend on implementation details, hardware utilization, and other external factors. In the literature, one therefore often finds a seemingly more objective performance measure: the number of disk accesses performed during a search. This approach, which has become popular with the B-tree, is based on the assumption that most searches are I/O-bound rather than CPU-bound, an assumption that is not always true in spatial data management, however. In applications where objects have complex shapes, the refinement step can incur major CPU costs and change the balance with I/O [Gaede 1995b; Hoel and Samet 1995]. Of course, one should keep the minimization of the number of disk accesses in mind as one design goal. Practical evaluations, however, should always give some information on elapsed times and the conditions under which they were achieved.

As noted previously, in contrast to relational databases, there exists neither a standard spatial algebra nor a standard spatial query language. The set of operators strongly depends on the given application domain, although some operators (such as intersection) are generally more common than others. Queries are often expressed by some extension of SQL that allows abstract data types to represent spatial objects and their associated operators [Roussopoulos and Leifker 1984; Egenhofer 1994]. The result of a query is usually a set of spatial data objects. In the remainder of this section, we give a formal definition of several of the more common spatial database operators. Figures 3 through 8 give some concrete examples.

Query 1 (Exact Match Query EMQ, Object Query). Given an object $o'$ with spatial extent $o'.G \subseteq E^d$, find all objects $o$ with the same spatial extent as $o'$: $EMQ(o') = \{o \mid o'.G = o.G\}$.

Query 2 (Point Query PQ). Given a point $p \in E^d$, find all objects $o$ overlapping $p$: $PQ(p) = \{o \mid p \cap o.G = p\}$.

Figure 3. Point query.
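The filter and refinement steps of Figure 2, applied to the point query (Query 2), can be sketched as follows. This is a minimal Python sketch under stated assumptions, not code from the survey: objects are simple two-dimensional polygons, a linear scan over the MBBs stands in for a real index, the refinement step uses a ray-casting test, and all names (SpatialObject, mbb_contains, polygon_contains, point_query) are hypothetical. The shortcut in which the MBB alone suffices to accept a candidate is not modeled.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[Point, Point]


@dataclass(frozen=True)
class SpatialObject:
    """Hypothetical database object: an ID plus a polygon as its exact shape."""
    oid: str
    vertices: Sequence[Point]

    def mbb(self) -> Box:
        """Minimum bounding box as ((l1, l2), (u1, u2))."""
        xs = [x for x, _ in self.vertices]
        ys = [y for _, y in self.vertices]
        return (min(xs), min(ys)), (max(xs), max(ys))


def mbb_contains(box: Box, p: Point) -> bool:
    """Cheap test used in the filter step."""
    (l1, l2), (u1, u2) = box
    return l1 <= p[0] <= u1 and l2 <= p[1] <= u2


def polygon_contains(vertices: Sequence[Point], p: Point) -> bool:
    """Exact test used in the refinement step (ray casting, even-odd rule).

    Points exactly on the boundary are not treated specially in this sketch.
    """
    x, y = p
    inside = False
    n = len(vertices)
    for i in range(n):
        (x1, y1), (x2, y2) = vertices[i], vertices[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through p
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def point_query(objects: Sequence[SpatialObject], p: Point) -> Tuple[List[str], List[str]]:
    """Return (query result, false drops) as lists of object IDs."""
    # Filter step: only the MBBs are inspected (linear scan instead of an index).
    candidates = [o for o in objects if mbb_contains(o.mbb(), p)]
    hits: List[str] = []
    false_drops: List[str] = []
    # Refinement step: test the exact shape of every candidate.
    for o in candidates:
        if polygon_contains(o.vertices, p):
            hits.append(o.oid)
        else:
            false_drops.append(o.oid)
    return hits, false_drops
```

For a triangle with vertices (0, 0), (4, 0), and (0, 4), the point (3, 3) lies inside the MBB but outside the triangle, so the filter step returns the triangle as a candidate and the refinement step reports it as a false drop.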
The point query can be regarded as a special case of several of the following queries, such as the intersection query, the window query, or the enclosure query.

Query 3 (Window Query WQ, Range Query). Given a d-dimensional interval $I^d = [l_1, u_1] \times [l_2, u_2] \times \cdots \times [l_d, u_d]$, find all objects $o$ having at least one point in common with $I^d$: $WQ(I^d) = \{o \mid I^d \cap o.G \ne \emptyset\}$.

Figure 4. Window query.

The query implies that the window is iso-oriented; that is, its faces are parallel to the coordinate axes. A more general variant is the region query, which permits search regions to have arbitrary orientations and shapes.

Query 4 (Intersection Query IQ, Region Query, Overlap Query). Given an object $o'$ with spatial extent $o'.G \subseteq E^d$, find all objects $o$ having at least one point in common with $o'$: $IQ(o') = \{o \mid o'.G \cap o.G \ne \emptyset\}$.

Figure 5. Intersection query.

Query 5 (Enclosure Query EQ). Given an object $o'$ with spatial extent $o'.G \subseteq E^d$, find all objects $o$ enclosing $o'$: $EQ(o') = \{o \mid (o'.G \cap o.G) = o'.G\}$.

Figure 6. Enclosure query.

Query 6 (Containment Query CQ). Given an object $o'$ with spatial extent $o'.G \subseteq E^d$, find all objects $o$ enclosed by $o'$: $CQ(o') = \{o \mid (o'.G \cap o.G) = o.G\}$.

Figure 7. Containment query.

The enclosure and the containment query are duals of each other. They are both more restrictive formulations of the intersection query by specifying the result of the intersection to be one of the two inputs.

Query 7 (Adjacency Query AQ). Given an object $o'$ with spatial extent $o'.G \subseteq E^d$, find all objects $o$ adjacent to $o'$: $AQ(o') = \{o \mid o.G \cap o'.G \ne \emptyset \wedge o'.G^{\circ} \cap o.G^{\circ} = \emptyset\}$. Here, $o'.G^{\circ}$ and $o.G^{\circ}$ denote the interiors of the spatial extents $o'.G$ and $o.G$, respectively.

Figure 8. Adjacency query.

Query 8 (Nearest-Neighbor Query NNQ). Given an object $o'$ with spatial extent $o'.G \subseteq E^d$, find all objects $o$ having a minimum distance from $o'$: $NNQ(o') = \{o \mid \forall o'' : dist(o'.G, o.G) \le dist(o'.G, o''.G)\}$.

The distance between extended spatial data objects is usually defined as the distance between their closest points. Common distance functions for points include the Euclidean and the Manhattan distance.

Besides spatial selections, as exemplified by Queries 1 through 8, the spatial join is one of the most important spatial operations and can be defined as follows [Günther 1993]:

Query 9 (Spatial Join). Given two collections $R$ and $S$ of spatial objects and a spatial predicate $\theta$, find all pairs of objects $(o, o') \in R \times S$ where $\theta(o.G, o'.G)$ evaluates to true: $R \bowtie_{\theta} S = \{(o, o') \mid o \in R \wedge o' \in S \wedge \theta(o.G, o'.G)\}$.

As for the spatial predicate $\theta$, a brief survey of the literature yields a wide variety of possibilities, including intersects(·), contains(·), is_enclosed_by(·), distance(·) $\Theta\, q$ with $\Theta \in \{=, \le, <, \ge, >\}$ and $q \in E^1$, northwest(·), and adjacent(·). (See Orenstein [1986], Becker [1992], Rotem [1991], Günther [1993], Brinkhoff et al. [1993a], Gaede and Riekert [1994], Brinkhoff [1994], Lo and Ravishankar [1994], Aref and Samet [1994], Papadias et al. [1995], and Günther et al. [1998].)

A closer inspection of these spatial predicates shows that the intersection join plays a crucial role for the computation in virtually all these cases [Gaede and Riekert 1994]. For predicates such as contains, encloses, or adjacent, for example, the intersection join is an efficient filter that yields a set of candidate solutions typically much smaller than the Cartesian product $R \times S$.

3. BASIC DATA STRUCTURES

3.1 One-Dimensional Access Methods

Classical one-dimensional access methods are an important foundation for almost all multidimensional access methods. Although the related surveys by Knott [1975] and Comer [1979] are somewhat dated, they represent good coverage of the different approaches. In practice, the most common one-dimensional structures include linear hashing [Litwin 1980; Larson 1980], extendible hashing [Fagin et al. 1979], and the B-tree [Bayer and McCreight 1972]. Hierarchical access methods such as the B-tree are scalable and behave well in the case of skewed input; they are nearly independent of the distribution of the input data. This is not necessarily true for hashing techniques, whose performance may degenerate depending on the given input data and hash function. This problem is aggravated by the use of order-preserving hash functions [Orenstein 1983; Garg and Gotlieb 1986] that try to preserve neighborhood relationships between data items in order to support range queries. As a result, highly skewed data keep accumulating at a few selected locations in image space.

Linear Hashing [Larson 1980; Litwin 1980]. Linear hashing divides the universe $[A, B)$ of possible hash values into binary intervals of size $(B-A)/2^k$ or $(B-A)/2^{k+1}$ for some $k \ge 0$. Each interval corresponds to a bucket, that is, a collection of records stored on a disk page. $t \in [A, B)$ is a pointer that separates the smaller intervals from the larger ones: all intervals of size $(B-A)/2^{k+1}$ are to the left of $t$ and all intervals of size $(B-A)/2^k$ are to the right of $t$. If a bucket reaches its capacity due to an insertion, the interval $[t, t + (B-A)/2^k)$ is split into two subintervals of equal size, and $t$ is advanced to the next large interval remaining. Note that the split interval need not be the same interval as the one that caused the split; consequently, there is no guarantee that the split relieves the bucket in question from its overload. If an interval contains more objects than bucket capacity permits, the overload is stored on an overflow page, which is linked to the original page. When $t = B$, the file has doubled and all intervals have the same length $(B-A)/2^{k+1}$. In this case we reset the pointer $t$ to $A$ and resume the split procedure for the smaller intervals.

Extendible Hashing [Fagin et al. 1979]. As does linear hashing, extendible hashing organizes the data in binary intervals, here called cells. Overflow pages are avoided in extendible hashing by using a central directory. Each cell has an index entry in that directory; it initially corresponds to one bucket. If during an insertion a bucket at maximal depth exceeds its maximum capacity, all cells are split into two. New index entries are created and the directory doubles in size. Since not every bucket was at full capacity before the split, it may now be possible to fit more than one cell in the same bucket. In that case, adjacent cells are regrouped in data regions and stored on the same disk page. In the case of skewed data this may lead to a situation where numerous directory entries exist for the same data region (and therefore the same disk page). Even in the case of uniformly distributed data, the average directory size is $\Theta(n^{1+1/b})$ and therefore superlinear [Flajolet 1983]. Here $b$ denotes the bucket size and $n$ is the number of index entries. Exact match searches take no more than two page accesses: one for the directory and one for the bucket with the data. This is more than the best-case performance of linear hashing, but better than the worst case.

Besides the potentially poor space utilization of the index, extendible hashing also suffers from a nonincremental
growth of the index due to the doubling steps. To address these problems, Lomet [1983] proposed a technique called bounded-index extendible hashing. In this proposal, the index grows as in extendible hashing until its size reaches a predetermined maximum; that is, the index size is bounded. Once this limit is reached while inserting new items, bounded-index extendible hashing starts doubling the data bucket size rather than the index size.

The B-Tree [Bayer and McCreight 1972]. Unlike hashing schemes, the B-tree and its variants [Comer 1979] organize the data in a hierarchical manner. B-trees are balanced trees that correspond to a nesting of intervals. Each node $\nu$ corresponds to a disk page $D(\nu)$ and an interval $I(\nu)$. If $\nu$ is an interior node, then the intervals $I(\nu_i)$ corresponding to the immediate descendants of $\nu$ are mutually disjoint subsets of $I(\nu)$. Leaf nodes contain pointers to data items; depending on the type of B-tree, interior nodes may do so as well. B-trees have an upper and lower bound for the number of descendants of a node. The lower bound prevents the degeneration of trees and leads to an efficient storage utilization. Nodes whose number of descendants drops below the lower bound are deleted and their contents distributed among the adjacent nodes at the same tree level. The upper bound follows from the fact that each tree node corresponds to exactly one disk page. If during an insertion a node reaches its capacity, it is split in two. Splits may propagate up the tree. As the size of the intervals depends on the given data (and the insertion sequence), the B-tree is an adaptive data structure. For uniformly distributed data, however, extendible as well as linear hashing outperform the B-tree on the average for exact match queries, insertions, and deletions.

3.2 Main Memory Structures

Early multidimensional access methods did not take into account paged secondary memory and are therefore less suited for large spatial databases. In this section, we review several of these fundamental data structures, which are adapted and incorporated in numerous multidimensional access methods. To illustrate the methods, we introduce a small scenario that we use as a running example throughout this survey. The scenario, depicted in Figure 9, contains 10 points pi and 10 polygons ri, randomly distributed in a finite two-dimensional universe. To represent polygons, we often use their centroids ci (not pictured) or their minimum bounding boxes (MBBs) mi. Note that the quality of the MBB approximation varies considerably. The MBB m8, for example, provides a fairly tight fit, whereas r5 is only about half as large as its MBB m5.

Figure 9. Running example.

The k-d-Tree [Bentley 1975]. One of the most prominent d-dimensional data structures is the k-d-tree. The k-d-tree is a binary search tree that represents a recursive subdivision of the universe into subspaces by means of (d − 1)-dimensional hyperplanes. The hyperplanes are iso-oriented, and their direction alternates among the d possibilities. For d = 3, for example, splitting hyperplanes are alternately perpendicular to the x-, y-, and z-axes. Each splitting hyperplane has to contain at least one data point, which is used for its representation in the tree. Interior nodes have one or two descendants each and function as discriminators to guide the search. Searching and insertion of new points are straightforward operations. Deletion is somewhat more complicated and may cause a reorganization of the subtree below the data point to be deleted.

Figure 10 shows a k-d-tree for the running example. Because the tree can only handle points, we represent the polygons by their centroids. The first splitting line is the vertical line crossing c3. We therefore store c3 in the root of the corresponding k-d-tree. The next splits occur along horizontal lines crossing p10 (for the left subtree) and c7 (for the right subtree), and so on.

Figure 10. k-d-tree.

One disadvantage of the k-d-tree is that the structure is sensitive to the order in which the points are inserted. Another one is that data points are scattered all over the tree. The adaptive k-d-tree [Bentley and Friedman 1979] mitigates these problems by choosing a split such that one finds about the same number of elements on both sides. Although the splitting hyperplanes are still parallel to the axes, they need not contain a data point and their directions need not be strictly alternating anymore. As a result, the split points are not part of the input data; all data points are stored in the leaves. Interior nodes contain the dimension (e.g., x or y) and the coordinate of the corresponding split. Splitting is continued recursively until each subspace contains only a certain number of points. The adaptive k-d-tree is a rather static structure; it is obviously difficult to keep the tree balanced in the presence of frequent insertions and deletions. The structure works best if all the data are known a
priori and if updates are rare. Figure 11 shows an adaptive k-d-tree for the running example. Note that the tree still depends on the order of insertion.

Another variant of the k-d-tree is the bintree [Tamminen 1984]. This structure partitions the universe recursively into d-dimensional boxes of equal size until each one contains only a certain number of points. Even though this kind of partitioning is less adaptive, it has several advantages, such as the implicit knowledge of the partitioning hyperplanes. In the remainder of this article, we encounter several other structures based on this kind of partitioning.

A disadvantage common to all k-d-trees is that for certain distributions no hyperplane can be found that splits the data points evenly [Lomet and Salzberg 1989]. By introducing a more flexible partitioning scheme, the following BSP-tree avoids this problem completely.

The BSP-Tree [Fuchs et al. 1980, 1983]. Splitting the universe only along iso-oriented hyperplanes is a severe restriction in the schemes presented so far. Allowing arbitrary orientations gives more flexibility to find a hyperplane that is well suited for the split. A well-known example for such a method is the binary space partitioning tree (BSP-tree). Like k-d-trees, BSP-trees are binary trees that represent a recursive subdivision of the universe into subspaces by means of (d − 1)-dimensional hyperplanes. Each subspace is subdivided independently of its history and of the other subspaces. The choice of the partitioning hyperplanes depends on the distribution of the data objects in a given subspace. The decomposition usually continues until the number of objects in each subspace is below a given threshold.

The resulting partition of the universe can be represented by a BSP-tree in which each hyperplane corresponds to an interior node of the tree and each subspace corresponds to a leaf. Each leaf stores references to those objects that are contained in the corresponding subspace. Figure 12 shows a BSP-tree for the running example with no more than two objects per subspace.

In order to perform a point query, we insert the search point into the root of the tree and determine on which side of the corresponding hyperplane it is located. Next, we insert the point into the corresponding subtree and proceed recursively until we reach a leaf of the tree. Finally, we examine the data objects in the
corresponding subspace to see whether they contain the search point. The range search algorithm is a straightforward generalization.

BSP-trees can adapt well to different data distributions. However, they are typically not balanced and may have very deep subtrees, which has a negative impact on the tree performance. BSP-trees also have higher space requirements, since storing an arbitrary hyperplane per split occupies more storage space than a simple discriminator, which is typically just a real number.

Figure 11. Adaptive k-d-tree.

Figure 12. BSP-tree.

The BD-Tree [Ohsawa and Sakauchi 1983]. The BD-tree is a binary tree representing a subdivision of the data space into interval-shaped regions. Each of those regions is encoded in a bit string and associated with one of the BD-tree nodes. Here, these bit strings are called DZ-expressions; they are also known as Peano codes, among other names.

Given a region, one computes the corresponding DZ-expression as follows. For simplicity we restrict this presentation to the two-dimensional case; we also assume that the first subdividing hyperplane is a vertical line. If the region lies to the left of that line, the first bit of the corresponding DZ-expression is 0; otherwise it is 1. In the next step, we subdivide the subspace containing the region by a horizontal line. If the region lies below that line, the second bit of the DZ-expression is 0; otherwise it is 1. As this decomposition progresses, we obtain one bit per splitting line. Bits at odd positions refer to vertical lines and bits at even positions to horizontal lines, which explains why this scheme is often referred to as bit interleaving.

To avoid the storage utilization problems that are often associated with a strictly regular partitioning, the BD-tree employs a more flexible splitting policy. Here one can split a node by making an interval-shaped excision from the corresponding region. The two child nodes of the node to be split will then have different interpretations: one represents the excision; the other one represents the remainder of the original region. Note that the remaining region is no longer interval-shaped. With this policy, the BD-tree can guarantee that, after node splitting, each of the data buckets contains at least one third of the original entries.

Figure 13 shows a BD-tree for the running example. An excision is always represented by the left child of the node that was split. For an exact match we first compute the full bit-interleaved prefix of the search record. Starting from the root, we recursively compare this prefix with the stored DZ-expressions of each internal node. If it matches, we follow the corresponding link; otherwise we follow the other link until we reach the leaf level of the BD-tree. More sophisticated algorithms were proposed later by Dandamudi and Sorenson [1986, 1991].

Figure 13. BD-tree.

The Quadtree. The quadtree with its many variants is closely related to the k-d-tree. For an extensive discussion of this structure, see Samet [1984, 1990a,b]. Although the term quadtree usually refers to the two-dimensional variant, the basic idea applies to an arbitrary dimension d. Like the k-d-tree, the quadtree decomposes the universe by means of iso-oriented hyperplanes. An important difference, however, is the fact that quadtrees are not binary trees anymore. In d dimensions, the interior nodes of a quadtree have $2^d$ descendants, each corresponding to an interval-shaped partition of the given subspace. These partitions do not have to be of equal size, although that is often the case. For d = 2, for example, each interior node has four descendants, each corresponding to a rectangle. These rectangles are typically referred to as the NW, NE, SW, and SE (northwest, etc.) quadrants. The decomposition into subspaces is usually continued until the number of objects in each partition is below a given threshold. Quadtrees are therefore not necessarily balanced; subtrees corresponding to densely populated regions may be deeper than others.

Searching in a quadtree is similar to searching in an ordinary binary search tree. At each level, one has to decide which of the four subtrees need be included in the future search. In the case of a point query, typically only one subtree qualifies, whereas for range queries there are often several. We repeat this search step recursively until we reach the leaves of the tree.

Finkel and Bentley [1974] proposed one of the first quadtree variants: the point quadtree, essentially a multidimensional binary search tree. The point quadtree is constructed consecutively by inserting the data points one by one. For each point, we first perform a point search. If we do not find the point in the tree, we insert it into the leaf node where the search has terminated. The corresponding partition is divided into $2^d$ subspaces with the new point at the center. The deletion of a point requires the restructuring of the subtree below the corresponding quadtree node. A simple way to achieve this is to reinsert all points into the subtree. Figure 14 shows a two-dimensional point quadtree for the running example.

Another popular variant is the region quadtree [Samet 1984]. Region quadtrees are based on a regular decomposition of the universe; that is, the $2^d$ subspaces resulting from a partition are always of equal size. This greatly facilitates searches. For the running example, Figure 15 shows how region quadtrees can be used to represent sets of points. Here the threshold for the number of points in any given subspace was set to one. In more complex versions of the region quadtree, such as the PM quadtree [Samet and Webber 1985], it is also possible to store polygonal data directly. PM quadtrees divide the quadtree regions (and the data objects in them) until they contain only a small number of polygon edges or vertices. These edges or vertices (which together form
an exact description of the data objects) are then attached to the leaves of the tree. Another class of quadtree structures has been designed for the management of collections of rectangles; see Samet [1988] for a survey.

Figure 14. Point quadtree.

4. POINT ACCESS METHODS

The multidimensional data structures presented in the previous section do not take secondary storage management into account explicitly. They were originally designed for main memory applications where all the data are available without accessing the disk. Despite growing main memories, this is of course not always the case. In many spatial database applications, such as geography, the amount of data to be managed is notoriously large. One can certainly use main memory structures for data that reside on disk, but their performance is often considerably below the optimum because there is no control over how the operating system performs the disk accesses. The access methods presented in this and the following section have been designed with secondary storage management in mind. Their operations are closely coordinated with the operating system to ensure that overall performance is optimized.

As mentioned before, we first present a selection of point access methods. Usually, the points in the database are organized in a number of buckets, each of which corresponds to a disk page and to some subspace of the universe. The subspaces (often called data regions, bucket regions, or simply regions, even
though their dimension may be greater than two) need not be rectilinear, although they often are. The buckets are accessed by means of a search tree or a d-dimensional hash function.

Figure 15. Region quadtree.

The grid file [Nievergelt et al. 1984], for example, uses a directory and a grid-like partition of the universe to answer an exact match query with exactly two disk accesses. Furthermore, there are multidimensional hashing schemes [Tamminen 1982; Kriegel and Seeger 1986, 1988], multilevel grid files [Whang and Krishnamurthy 1985; Hutflesz et al. 1988b], and hash trees [Ouksel 1985; Otoo 1985], which organize the directory as a tree structure. Tree-based access methods are usually a generalization of the B-tree to higher dimensions, such as the k-d-B-tree [Robinson 1981] or the hB-tree [Lomet and Salzberg 1989].

In the remainder of this section, we first discuss the approaches based on hashing, followed by hierarchical (tree-based) methods, and space-filling curves. This classification is hardly unambiguous, especially in the presence of an increasing number of hybrid approaches that attempt to combine the advantages of several different techniques. Our approach resembles the classification of Samet [1990], who distinguishes between hierarchical methods (point/region quadtrees, k-d-trees, range trees) and bucket methods (grid file, EXCELL). His discussion of the former is primarily in the context of main memory applications. Our presentation focuses throughout on structures that take secondary storage management into account.

Another interesting taxonomy has been proposed by Seeger and Kriegel [1990], who classify point access methods by the properties of the bucket regions (Table 1). First, they may be pairwise disjoint or they may have mutual overlaps. Second, they may have the shape of an interval (box) or be of some arbitrary polyhedral shape. Third, they may cover the complete universe or just those parts that contain some data objects. This taxonomy results in eight classes, four of which are populated by existing access methods.

Table 1. Classification of PAMs Following Seeger and Kriegel [1990]

4.1 Multidimensional Hashing

Although there is no total order for objects in two- and higher-dimensional space that completely preserves spatial proximity, there have been numerous attempts to construct hashing functions that preserve proximity at least to some extent. The goal of all these heuristics is that objects located close to each other in original space should be likely to be stored close together on the disk. This could contribute substantially to minimizing the number of disk accesses per range query. We begin our presentation with several structures based on extendible hashing. Structures based on linear hashing are discussed in Section 4.1.5. The discussion of two hybrid methods, the BANG file and the buddy tree, is postponed until Section 4.2.

The Grid File [Nievergelt et al. 1984]. As a typical representative of an access method based on hashing, we first discuss the grid file and some of its variants (see Hinrichs [1985], Ouksel [1985], Whang and Krishnamurthy [1985], Six and Widmayer [1988], and Blanken et al. [1990]). The grid file superimposes a d-dimensional orthogonal grid on the universe. Because the grid is not necessarily regular, the resulting cells may be of different shapes and sizes. A grid directory associates one or more of these cells with data buckets, which are stored on one disk page each. Each cell is associated with one bucket, but a bucket may contain several adjacent cells. Since the directory may grow large, it is usually kept on secondary storage. To guarantee that data items are always found with no more than two disk accesses for exact match queries, the grid itself is kept in main memory, represented by d one-dimensional arrays called scales.

Figure 16 shows a grid file for the running example. We assume bucket capacity to be four data points. The center of the figure shows the directory with
scales on the x- and y-axes. The data points are displayed in the directory for demonstration purposes only; they are not, of course, stored there. In the lower left part, four cells are combined into a single bucket, represented by four pointers to a single page. There are thus four directory entries for the same page, which illustrates a well-known problem of the grid file: it suffers from a superlinear growth of the directory even for data that are uniformly distributed [Regnier 1985; Widmayer 1991]. The bucket region containing the point c5 could have been merged with one of the neighboring buckets for better storage utilization. We present various merging strategies later, when we discuss the deletion of data points.

Figure 16. Grid file.

To answer an exact match query, one first uses the scales to locate the cell containing the search point. If the appropriate grid cell is not in main memory, one disk access is necessary. The loaded cell contains a reference to the page where possibly matching data can be found. Retrieving this page may require another disk access. Altogether, no more than two page accesses are necessary to answer this query. For a range query, one must examine all cells that overlap the search region. After eliminating duplicates, one fetches the corresponding data pages into memory for more detailed inspection.

To insert a point, one first performs an exact match query to locate the cell and the data page where the entry should be inserted. If there is sufficient space left on that page, the new entry is inserted. If not, we have to distinguish two cases, depending on the number of grid cells that point to the data page where the new data item is to be inserted. If there are several, one checks whether an existing hyperplane stored in the scales can be used for splitting the data page successfully. If so, a new data page is allocated and the data points are distributed accordingly among the data pages. If none of the existing hyperplanes is suitable, or if only one grid cell points to the data page in question, a new splitting hyperplane is introduced and a new data page is allocated. The new entry and the entries of the original page are redistributed among the two pages, depending on their location relative to the new hyperplane. The hyperplane is inserted into the corresponding scale; all cells that it intersects are split accordingly. Splitting is therefore not a local operation and can lead to superlinear directory growth even for uniformly distributed data [Regnier 1985; Freeston 1987; Widmayer 1991].

Deletion is not a local operation either. With the deletion of an entry, the storage utilization of the corresponding data page may drop below the given threshold. Depending on the current partitioning of space, it may then be possible to merge this page with a neighbor page and to drop the partitioning hyperplane from the corresponding scale. Depending on the implementation of the grid directory, merging may require a complete directory scan [Hinrichs 1985]. Hinrichs discusses several methods for finding candidates with which a given data bucket can merge, including the neighbor system and the multidimensional buddy system. The neighbor system allows merging two adjacent regions if the result is a rectangular region again. In the buddy system, two adjacent regions can be merged provided that the joined region can be obtained by a regular binary subdivision of the universe. Neither system can completely eliminate the possibility of a deadlock, in which case no merging is feasible because the resulting bucket region would not be box-shaped [Hinrichs 1985; Seeger and Kriegel 1990].

For a theoretical analysis of the grid file and some of its variants, see Regnier [1985] and Becker [1992]. Regnier derives in particular the grid file's average directory size for uniformly distributed data as a function of the bucket size. He also proves that the average space occupancy of the data buckets is about 69% (ln 2).
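The exact match and range query procedures described above can be sketched in a few lines. This is a minimal in-memory illustration for two dimensions, not the implementation of Nievergelt et al.: the class name `GridFile`, its fields, and the use of plain Python lists to stand in for disk pages are all assumptions made for the example.

```python
from bisect import bisect_right


class GridFile:
    """Toy 2-D grid file: scales in memory, directory maps cells to buckets."""

    def __init__(self, x_scale, y_scale, directory, buckets):
        self.scales = (x_scale, y_scale)  # sorted split positions per axis
        self.directory = directory        # directory[i][j] -> bucket id
        self.buckets = buckets            # bucket id -> list of points ("disk page")

    def cell_of(self, point):
        # Locate the cell index along each axis using the in-memory scales.
        return tuple(bisect_right(s, c) for s, c in zip(self.scales, point))

    def exact_match(self, point):
        i, j = self.cell_of(point)        # no disk access: scales are in memory
        bucket_id = self.directory[i][j]  # first access: directory page
        return point in self.buckets[bucket_id]  # second access: data page

    def range_query(self, lo, hi):
        # Collect the buckets of all cells overlapping the query box; the set
        # removes duplicates when several cells share one bucket.
        (i0, j0), (i1, j1) = self.cell_of(lo), self.cell_of(hi)
        ids = {self.directory[i][j]
               for i in range(i0, i1 + 1) for j in range(j0, j1 + 1)}
        return sorted(p for b in ids for p in self.buckets[b]
                      if lo[0] <= p[0] <= hi[0] and lo[1] <= p[1] <= hi[1])
```

With one split position per axis, the directory has four cells; letting two of them refer to the same bucket id reproduces the situation in which several directory entries point to a single page.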
EXCELL [Tamminen 1982]. Closely related to the grid file is the EXCELL method (Extendible CELL) proposed by Tamminen [1982]. In contrast to the grid file, where the partitioning hyperplanes may be spaced arbitrarily, the EXCELL method decomposes the universe regularly: all grid cells are of equal size. In order to maintain this property in the presence of insertions, each new split results in the halving of all cells and therefore in the doubling of the directory size. To alleviate this problem, Tamminen [1983] later suggested a hierarchical method, similar to the multilevel grid file of Whang and Krishnamurthy [1985]. Overflow pages are introduced to limit the depth of the hierarchy.

The Two-Level Grid File [Hinrichs 1985]. The basic idea of the two-level grid file is to use a second grid file to manage the grid directory. The first of the two levels is called the root directory, which is a coarsened version of the second level, the actual grid directory. Entries of the root directory contain pointers to the directory pages of the lower level, which in turn contain pointers to the data pages. By having a second level, splits are often confined to the subdirectory regions without affecting too much of their surroundings. Even though this modification leads to a slower directory growth, it does not solve the problem of superlinear directory size. Furthermore, Hinrichs implicitly assumes that the second level can be kept in main memory, so that the two-disk access principle still holds. Figure 17 shows a two-level grid file for the running example. Each cell in the root directory has a pointer to the corresponding entries in the subdirectory, which have their own scales in turn.

Figure 17. Two-level grid file.

The Twin Grid File [Hutflesz et al. 1988b]. The twin grid file tries to increase space utilization compared to
the original grid file by introducing a second grid file. As indicated by the name "twin," the relationship between these two grid files is not hierarchical, as in the case of the two-level grid file, but somewhat more balanced. Both grid files span the whole universe. The distribution of the data among the two files is performed dynamically. Hutflesz et al. [1988b] report an average occupancy of 90% for the twin grid file (compared to 69% for the original grid file) without substantial performance penalties.

To illustrate the underlying technique, consider the running example depicted in Figure 18. Let us assume that each bucket can accommodate four points. If the number of points in a bucket exceeds that limit, one possibility is to create a new bucket and redistribute the points among the two new buckets. Before doing this, however, the twin grid file tries to redistribute the points between the two grid files. A transfer of points from the primary file to the secondary file may lead to a bucket overflow in the secondary file. It may, however, also imply a bucket underflow in the primary file, which may in turn lead to a bucket merge and therefore to a reduction of buckets in the primary file. The overall objective of the reshuffling is to minimize the total number of buckets in the two grid files. Therefore we shift points from the primary to the secondary file if and only if the resulting decrease in the number of buckets in the primary file outweighs the increase in the number of buckets in the secondary file. This strategy also favors placing points in the primary file in order to form large and empty buckets in the secondary file. Consequently, all points in the secondary file can be associated with an empty or a full bucket region of the primary file. Note that there usually exists no unique optimum for the distribution of data points between the two files.

Figure 18. Twin grid file.

The fact that data points may be found in either of the two grid files requires search operations to visit the two files, which causes some overhead. Nevertheless, the performance results reported by Hutflesz et al. [1988b] indicate that the search efficiency of the twin grid file is competitive with the original grid file. Although the twin grid file is somewhat inferior to the original grid file for smaller query ranges, this changes for larger search spaces.

Multidimensional Linear Hashing. Unlike multidimensional extendible hashing, multidimensional linear hashing uses no or only a very small directory. It therefore occupies relatively little storage compared to extendible hashing, and it is usually possible to keep all relevant information in main memory.

Several different strategies have been proposed to perform the required address computation. Early proposals [Ouksel and Scheuermann 1983] failed to support range queries; however, Kriegel and Seeger [1986] later proposed a variant of linear hashing called multidimensional order-preserving linear hashing with partial expansions (MOLHPE). This structure is based on the idea of partially extending the buckets without expanding the file size at the same time. To this end, they use a d-dimensional expansion pointer referring to the group of pages to be expanded next. With this strategy, Kriegel and Seeger can guarantee a modest file growth, at least in the case of well-behaved data. According to their experimental results, MOLHPE outperforms its competitors for uniformly distributed data. It fails, however, for nonuniform distributions, mostly because the hashing function does not adapt gracefully to the given distribution.

To solve this problem, the same authors later applied a stochastic technique [Burkhard 1984] to determine the split points. Because of the name of that technique (quantiles), the access method was called quantile hashing [Kriegel and Seeger 1987, 1989]. The critical property of the division in quantile hashing is that the original data, which may have a nonuniform distribution, are transformed into uniformly distributed values. These values are then used as input to the MOLHPE algorithms for retrieval and update. Since the region boundaries are not necessarily simple binary intervals, a small directory is needed. In exchange, skewed input data can be maintained as efficiently as uniformly distributed data.

Piecewise linear order-preserving (PLOP) hashing was proposed by the same authors a year later [Kriegel and Seeger 1988]. Because this structure can also be used as an access method for extended objects, we delay its discussion until Section 5.2.7.

Another variant with better order-preserving properties than MOLHPE has been reported by Hutflesz et al. [1988a]. Their dynamic z-hashing uses a space-filling technique called z-ordering [Orenstein and Merrett 1984] to guarantee that points located close to each other are also stored close together on the disk. Z-ordering is described in detail in Section 5.1.2. One disadvantage of z-hashing is that a number of useless data blocks will be generated, as in the interpolation-based grid file [Ouksel 1985]. On the other hand, z-hashing lets three to four buckets be read in a row on the average before a seek is required, whereas MOLHPE manages to read only one [Hutflesz et al. 1988a]. Widmayer [1991] later noted, however, that both z-hashing and MOLHPE are of limited use in practice, due to their inability to adapt to different data distributions.

4.2 Hierarchical Access Methods

In this section we discuss several PAMs that are based on a binary or multiway tree structure. Except for the BANG file and the buddy tree, which are hybrid structures, they perform no address computation. Like hashing-based methods, however, they organize the data points in a number of buckets. Each bucket usually corresponds to a leaf node of the tree (also called data node) and a disk page, which contains those points located in the corresponding bucket region. The interior nodes of the tree (also called index nodes) are used to guide the search; each of them typically corresponds to a larger subspace of the universe that contains all bucket regions in the subtree below. A search operation is then performed by a top-down tree traversal.

At this point, individual tree structures still dominate the field, although more generic concepts are gradually attracting more attention. The generalized search (GIST) tree by Hellerstein et al. [1995], for example, attempts to subsume many of these common features under a generic architecture.

Differences among individual structures are mainly based on the characteristics of the regions. Table 1 shows that in most PAMs the regions at the same tree level form a partitioning of the universe; that is, they are mutually disjoint, with their union being the complete space. For SAMs this is not necessarily true; as we show in Section 5, overlapping regions and partial coverage are important techniques to improve the search performance of SAMs.
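The top-down traversal mentioned in this section can be illustrated with a small sketch. It assumes box-shaped regions that are mutually disjoint at each level, as is the case for most PAMs; the names `IndexNode`, `DataNode`, and `point_search`, as well as the half-open box convention, are choices made for this example only and do not belong to any particular access method.

```python
from dataclasses import dataclass, field


def contains(box, p):
    """box = (lo, hi) with coordinate tuples; the upper bounds are exclusive."""
    lo, hi = box
    return all(l <= c < h for l, c, h in zip(lo, p, hi))


@dataclass
class DataNode:
    """Leaf node: stands in for one bucket, i.e. one disk page of points."""
    points: list = field(default_factory=list)


@dataclass
class IndexNode:
    """Interior node: (region, child) pairs with pairwise disjoint regions."""
    entries: list


def point_search(node, p):
    """Descend from the root; at most one child region contains p per level."""
    while isinstance(node, IndexNode):
        for region, child in node.entries:
            if contains(region, p):
                node = child
                break
        else:
            return False  # p lies outside every region of this node
    return p in node.points
```

Because the regions are disjoint, a point query follows a single path from the root to one bucket. With overlapping regions, as in many SAMs, the loop would have to follow every matching entry instead of stopping at the first one.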
The k-d-B-Tree [Robinson 1981]. The k-d-B-tree combines some of the properties of the adaptive k-d-tree [Bentley and Friedman 1979] and the B-tree [Comer 1979] to handle multidimensional points. It partitions the universe in the manner of an adaptive k-d-tree and associates the resulting subspaces with tree nodes. Each interior node corresponds to an interval-shaped region. Regions corresponding to nodes at the same tree level are mutually disjoint; their union is the complete universe. The leaf nodes store the data points that are located in the corresponding partition. Like the B-tree, the k-d-B-tree is a perfectly balanced tree that adapts well to the distribution of the data. Unlike B-trees, however, no minimum space utilization can be guaranteed. A k-d-B-tree for the running example is sketched in Figure 19.

Figure 19. k-d-B-tree.

Search queries are answered in a straightforward manner, analogously to the k-d-tree algorithms. For the insertion of a new data point, one first performs a point search to locate the right bucket. If it is not full, the entry is inserted. Otherwise, it is split and about half the entries are shifted to the new data node. Various heuristics are available to find an optimal split [Robinson 1981]. If the parent index node does not have enough space left to accommodate the new entries, a new page is allocated and the index node is split by a hyperplane. The entries are distributed among the two pages depending on their position relative to the splitting hyperplane, and the split is propagated up the tree. The split of the index node may also affect regions at lower levels of the tree, which must be split by this hyperplane as well. Because of this forced split effect, it is not possible to guarantee a minimum storage utilization.

Deletion is straightforward. After performing an exact match query, the entry is removed. If the number of entries drops below a given threshold, the data node may be merged with a sibling data node as long as the union remains a d-dimensional interval. The procedure to find a suitable sibling node to merge with may involve several nodes. The union of data pages results in the deletion of at least one hyperplane in the
parent index node. If an underflow occurs, the deletion has to be propagated up the tree.

The LSD-Tree [Henrich et al.]. We list the LSD (Local Split Decision) tree as a point access method although its inventors emphasize that the structure can also be used for managing extended objects. This claim is based on the fact that the LSD-tree adapts well to data that are nonuniformly distributed and that it is therefore well-suited for use in connection with the transformation technique; a more detailed discussion of this approach appears in Section 5.1.1.

The directory of the LSD-tree is organized as an adaptive k-d-tree, partitioning the universe into disjoint cells of various sizes. This results in a better adaptation to the data distribution than the fixed binary partitioning. Although the k-d-tree may be arbitrarily unbalanced, the LSD-tree preserves the external balancing property; that is, the heights of its external subtrees differ at most by one. This property is maintained by a special paging algorithm. If the structure becomes too large to fit in main memory, this algorithm identifies subtrees that can be paged out such that the external balancing property is preserved. Although efficient, this special paging strategy is obviously a major impediment for the integration of the LSD-tree into a general-purpose database system. Figure 20 shows an LSD-tree for the running example with one external directory page.

Figure 20. LSD-tree.

As indicated previously, the split strategy of the LSD-tree does not assume the data to be uniformly distributed. On the contrary, it tries to accommodate skewed data by combining two split strategies:

- data-dependent: The choice of the split depends on the data and tries to achieve a most balanced structure; that is, there should be an equal number of objects on both sides of the split. As the name of the structure suggests, this split decision is made locally.
- distribution-dependent: The split is done at a fixed dimension and position. The given data are not taken into account because an underlying (known) distribution is assumed.

To determine the split position, one computes a linear combination of the two split locations that would result from applying just one of those strategies, weighted by a single factor. The factor is determined empirically based on the given data; it can vary as objects are inserted and deleted from the tree.

Henrich [1995] presented two algorithms to improve the storage utilization of the LSD-tree by redistributing data entries among buckets. Since these strategies make the LSD-tree sensitive to the insertion sequence, the splitting strategy must be adapted accordingly. In order to improve the search performance for nonpoint data and range queries, Henrich and Möller [1995] suggest storing auxiliary information on the existing data regions along with the index entries of the LSD-tree.

The Buddy Tree [Seeger and Kriegel 1990]. The buddy tree is a dynamic hashing scheme with a tree-structured directory. The tree is constructed by consecutive insertion, cutting the universe recursively into two parts of equal size with iso-oriented hyperplanes. Each interior node corresponds to a d-dimensional partition and to an interval, which is the MBB of the points or intervals below it. Partitions (and therefore intervals) that correspond to nodes on the same tree level are mutually disjoint. As in all tree-based structures, the leaves of the directory point to the data pages. Other important properties of the buddy tree include:

(1) each directory node contains at least two entries;
(2) whenever a node is split, the intervals (MBBs) of the two resulting subnodes are recomputed to reflect the current situation; and
(3) except for the root of the directory, there is exactly one pointer referring to each directory page.

Due to property 1, the buddy tree may not be balanced; that is, the leaves of the directory may be on different levels. Property 2 tries to achieve a high selectivity at the directory level. Properties 1 and 3 make sure that the growth of the directory remains linear. To avoid the deadlock problem of the grid file, the buddy tree uses k-d-trees [Orenstein 1982] to partition the universe. Only a restricted number of buddies are admitted, namely, those that could have been obtained by some recursive halving of the universe. However, as shown by Seeger and Kriegel [1990], the number of possible buddies is larger than in the grid file and other structures, which makes the buddy tree more flexible in the case of updates. Experiments by Kriegel et al. [1990] indicate that the buddy tree is superior to several other PAMs, including the hB-tree, the BANG file, and the two-level grid file. A buddy tree for the running example is shown in Figure 21.

Two older structures, the interpolation-based grid file by Ouksel [1985] and the balanced multidimensional extendible hash tree by Otoo [1986], are both special cases of the buddy tree that can be obtained by restricting the properties of the regions. Interpolation-based grid files avoid the excessive growth of the grid file directory by representing blocks explicitly, which guarantees that there is only one directory entry for each data bucket. The disadvantage of this approach is that empty regions have to be introduced in the case of skewed data input. Seeger [1991] later showed that the buddy tree can easily be modified to handle spatially extended objects by using one of the techniques presented in Section 5.

The BANG File [Freeston 1987]. To obtain a better adaptation to the given data points, Freeston [1987] proposed a new structure, which he called the BANG (Balanced And Nested Grid) file, even though it differs from the grid file in many aspects. Similar to the grid file, it partitions the universe into intervals (boxes). What is different, however, is that in the BANG file bucket regions may intersect, which is not possible in the regular grid file. In particular, one can form nonrectangular bucket regions by taking the geometric difference of two or more intervals (nesting). To increase storage utilization, it is possible during insertion to redistribute points between different buckets. To manage the directory, the BANG file uses a balanced search tree structure. In combination with the hash-based partitioning of the universe,
the BANG file can therefore be viewed as a hybrid structure.

Figure 22 shows the BANG file for the running example. Three rectangles have been cut out of the universe R1: R2, R5, and R6. In turn, the rectangles R3 and R4 are nested into R2 and R5, respectively. If one represents the resulting space partitioning as a tree using bit interleaving, one obtains the structure shown on the right-hand side of Figure 22. Here the asterisk represents the empty string, that is, the universe. A comparison with Figure 13 shows that the BANG file can in fact be regarded as a paginated version of the BD-tree discussed in Section 3.2.3.

In order to achieve a high storage utilization, the BANG file performs spanning splits that may lead to the displacement of parts of the tree. As a result, a point search may in the worst case require the traversal of the entire directory in a depth-first manner. To address this problem, Freeston [1989a] later proposed different splitting strategies, including forced splits as used by the k-d-B-tree. These strategies avoid the spanning problem at the possible expense of lower storage utilization. Kumar [1994a] made a similar proposal based on the BD-tree and called the resulting structure a G-tree (grid tree). The structure differs from the BD-tree in the way the partitions are mapped into buckets. To obtain a simpler mapping, the G-tree sacrifices the minimum storage utilization that holds for the BD-tree.

Although the data partitioning given in Figure 22 is feasible for the BD-tree and the original BANG file, it cannot be achieved with the BANG file using forced splits [Freeston 1989a]. For this variant, we would have to split the root and move, for example, entry c5 to the bucket containing the entries p7 and c6.

Freeston [1989b] also proposed an extension to the BANG file to handle extended objects. As often found in PAM extensions, the centroid is used to determine the bucket in which to place a given object. To account for the object's spatial extension, the bucket regions are extended where necessary [Seeger and Kriegel 1988; Ooi 1990].
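The nesting of bucket regions described in this section can be made concrete with bit-interleaved prefixes. The sketch below is a simplification and not Freeston's directory structure: it assumes points in the unit square, identifies each block by a bit-string prefix (the empty string being the universe), and assigns a point to the smallest enclosing block, that is, to the longest matching prefix. The function names and the labels A, B, and C are hypothetical.

```python
def interleave(x, y, bits=8):
    """Bit-interleave two coordinates in [0, 1): x bit first, then y bit."""
    out = []
    for _ in range(bits):
        x *= 2
        y *= 2
        bx, by = int(x), int(y)  # next binary digit of each coordinate
        x -= bx
        y -= by
        out.append(str(bx))
        out.append(str(by))
    return "".join(out)


def find_region(regions, x, y):
    """Return the label of the smallest block (longest prefix) enclosing (x, y).

    `regions` maps bit-string prefixes to bucket labels and must contain the
    empty prefix "", which stands for the whole universe.
    """
    key = interleave(x, y)
    best = max((p for p in regions if key.startswith(p)), key=len)
    return regions[best]


# Three nested blocks: the universe, its lower left quadrant ("00"), and the
# upper right subquadrant of that quadrant ("0011").
regions = {"": "A", "00": "B", "0011": "C"}
```

In this example the bucket region of A is the universe minus the lower left quadrant, and the region of B is that quadrant minus the nested block C, so both are L-shaped differences of boxes.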
Figure 21. Buddy tree.

Ouksel and Mayer [1992] proposed an access method called a nested interpolation-based grid file that is closely related to the BANG file. The major difference concerns the way the directory is organized. In essence, the directory consists of a list of one-dimensional access methods (e.g., B-trees) storing the z-order encoding of the different data regions, along with pointers to the respective data buckets. By doing so, Ouksel and Mayer improved the worst-case bounds of the BANG file to a logarithmic bound that depends on the bucket size.

The hB-Tree [Lomet and Salzberg 1989, 1990]. The hB-tree (holey brick tree) is related to the k-d-B-tree in that it utilizes k-d-trees to organize the space represented by its interior nodes. One of the most noteworthy differences is that node splitting is based on multiple attributes. As a result, nodes no longer correspond to d-dimensional intervals but to intervals from which smaller intervals have been excised. Similar to the BANG file, the result is a somewhat fractal structure (a holey brick) with an external enclosing region and several cavities called extracted regions. As we show later, this technique avoids the cascading of splits that is typical for many other structures.

In order to minimize redundancy, the k-d-tree corresponding to an interior node can have several leaves pointing to the same child node. Strictly speaking, the hB-tree is therefore no longer a tree but a directed acyclic graph. With regard to the geometry, this corresponds to the union of the corresponding regions. Once again, the resulting region is typically no longer box-shaped. This peculiarity is illustrated in Figure 23, which shows an hB-tree for the running example. Here the root node contains two pointers to its left descendant node. Its corresponding region is the union of two rectangles: the one to the left of x1 and the one above y1. The remaining space (the right lower quadrant) is excluded from this region, which is made explicit by a corresponding entry in the k-d-tree. A similar observation applies to another region in the figure, which is again L-shaped: it corresponds to the NW, the SE, and the NE quadrants of its enclosing rectangle.

Figure 22. BANG file.

Searching is similar to the k-d-B-tree; each internal k-d-tree is traversed as usual. Insertions are also carried out analogously to the k-d-B-tree until a leaf node reaches its capacity and a split is required. Instead of using just one single hyperplane to split the node, the hB-tree split is based on more than one attribute and on the internal k-d-tree of the data node to be split. Lomet and Salzberg [1989] show that this policy guarantees a worst-case data distribution between the two resulting nodes of 1/3 : 2/3. This observation is not restricted to the hB-tree but generalizes to other access methods such as the BD-tree and the BANG file.

The split of the leaf node causes the introduction of an additional k-d-tree node to describe the resulting subspace. This may in turn lead to the split of the ancestor node and its k-d-tree. Since k-d-trees are not height-balanced, splitting the tree at its root may lead to an unbalanced distribution of the nodes. The tree is therefore usually split at a lower level, which corresponds to the excision of a convex region from the space corresponding to the node to be split. The entries belonging to that subspace are extracted and moved to a new hB-tree node. To reflect the absence of the excised region, the hB-tree node is assigned an external marker, which indicates that the region is no longer a simple interval. With this technique the problem of forced splits is avoided. Splits are local and do not have to be propagated downwards.

In summary, the leaf nodes of the internal k-d-trees are used to

- reference a collection of data records;
- reference other hB-tree nodes;
- indicate that a part of this tree has been extracted.

In a later Ph.D. thesis [Evangelidis 1994], the hB-tree is extended to allow for concurrency and recovery by modifying it in such a way that it becomes a special case of the Π-tree [Lomet and Salzberg 1992]. Consequently, the new structure is called the hB^Π-tree [Evangelidis et al. 1995]. As a result of these modifications, the new structure can immediately take advantage of the Π-tree node consolidation algorithm. The lack of such an algorithm has been one of the major weaknesses of the hB-tree. Furthermore, the hB^Π-tree corrects a flaw in the splitting/posting algorithm of the hB-tree that may occur for more than three index levels. The essential idea of the correction is to impose restrictions on the splitting/posting algorithms, which in turn affects the space occupancy.

One minor problem remains: as mentioned, the hB-tree may store several references to the same child node. The number of nodes may in principle exhibit a growth behavior that is superlinear in the number of regions; however, this observation seems of mainly theoretical interest. According to the authors of the hB^Π-tree [Evangelidis et al. 1995], it is quite rare that more than one leaf of the underlying k-d-tree refers to any given child. In their experiments, more than 95% of the index nodes and all of the data nodes had only one such reference.

Figure 23. hB-tree.

The BV-Tree [Freeston 1995]. The BV-tree represents an attempt to solve the multidimensional B-tree problem, that is, to find a generic generalization of the B-tree to higher dimensions. The BV-tree is not meant to be a concrete access method, but rather a conceptual framework that can be applied to a variety of existing access methods, including the BANG file or the hB-tree.

Freeston's proposal is based on the conjecture that one can maintain the major strengths of the B-tree in higher dimensions, provided one relaxes the strict requirements concerning tree balance and storage utilization. The BV-tree is not completely balanced. Furthermore, although the B-tree guarantees a worst-case storage utilization of 50%, Freeston argues that such a comparatively high storage utilization cannot be ensured for higher dimensions for topological reasons. However, the BV-tree manages to achieve the 33% lower bound suggested by Lomet and Salzberg [1989].

To achieve a guaranteed worst-case search performance, the BV-tree combines the excision concept [Freeston 1987] with a technique called promotion. Here, intervals from lower levels of the tree are moved up the tree, that is, closer to the root. To keep track of the resulting changes, with each promoted region we store a level number (called a guard) that denotes the region's original level.

The search algorithms are based on a notional backtracking technique. While descending the tree, we store possible alternatives (relevant guards of the different index levels) in a guard set. The entries of this set act as backtracking points and represent a single path from the root to the level currently inspected; for point queries, they can be maintained as a stack. To answer a point query, we start at the root and inspect all node entries to see whether the corresponding regions overlap the search point. Among those entries inspected, we choose the best-matching entry to investigate next. We may also store some guards in the guard set. At the next level this procedure is repeated recursively, this time taking the stored guards into account. Before following the best-matching entry down to the next level, the guard set is updated by merging the matching new guards with the existing ones. Two guards at the same level are merged by discarding the poorer match. This search continues recursively until we reach the leaf level. Note that for point queries, the length of the search path is equal to the height of the BV-tree because each region in space is represented by a unique node.

Figure 24. BV-tree.

Figure 24 shows a BV-tree and the corresponding space partitioning for the running example. For illustration purposes we confine the grouped regions or objects not by a tight polyline, but by a loosely wrapped boundary. In this example, the region D0 acts as a guard. It is clear from the space partitioning that D0 originally belongs to the bottom index level (i.e., the middle level in the figure). Since it functions as a guard for the enclosed region S1, however, it has been promoted to the root level. Suppose we are interested in all objects intersecting the black rectangle X. Starting at the root, we place D0 in the guard set and investigate S1. Because inspection of S1 reveals that the search
regionisincludedneitherinP0norinN0orM0,webacktracktoD0andin-specttheentriesforD0.Inourexample,noentrysatisfiesthequery.Inalaterpaper,Freeston[1997]dis-cussescomplexityissuesrelatedtoup-datesofguards.Inthepresenceofsuchupdates,itisnecessarytoªdowngradeº(demote)entriesthatarenolongerguards,whichmayinturnaffecttheoverallstructurenegatively.Freeston'sconclusionisthatthelogarithmicaccessperformanceandtheminimumstorageutilizationoftheBV-treecanbepre-servedbypostponingthedemotionofsuchentries,whichmayleadto(very)largeindexnodes.4.3Space-FillingCurvesforPointDataWealreadymentionedthemainreasonwhythedesignofmultidimensionalac-cessmethodsissodifficultcomparedtotheone-dimensionalcase:Thereisnototalorderthatpreservesspatialprox-imity.Onewayoutofthisdilemmaistofindheuristicsolutions,thatis,tolookfortotalordersthatpreservespatialproximityatleasttosomeextent.Theideaisthatiftwoobjectsarelocatedclosetogetherinoriginalspace,thereshouldatleastbeahighprobabilitythattheyareclosetogetherinthetotalorder,thatis,intheone-dimensionalimagespace.Fortheorganizationofthistotalorderonecouldthenuseaone-dimensionalaccessmethod(suchas-tree),whichmayprovidegoodper-formanceatleastforpointqueries.Rangequeriesaresomewhatmorecom-plicated;asimplemappingfrommulti-dimensionaltoone-dimensionalrangequeriesoftenimpliesmajorperformancepenalties.TropfandHerzog[1981]presentamoresophisticatedandeffi-cientalgorithmforthisproblem.Researchontheunderlyingmappingproblemgoesbackwellintothelastcentury;seeSagan[1994]forasurvey.Withregardtoitsrelevanceforspatialsearching,Samet[1990b]providesagoodoverviewofthesubject.Onethingallproposalshaveincommonisthattheyfirstpartitiontheuniversewithagrid.Eachofthegridcellsislabeledwithauniquenumberthatdefinesitspositioninthetotalorder(thespace-fillingcurve).Thepointsinthegivendatasetarethensortedandindexedaccordingtothegridcellinwhichtheyarecontained.Notethatalthoughthelabelingisindependentofthegivendata,itisobviouslycriticalforthepres-ervationofproximityinone-dimen-sionaladdressspace.
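As a rough illustration of such a labeling, the following sketch assigns each cell of a grid with 2^bits cells per axis either a row-wise number or a z-order (bit-interleaved) number, and sorts points by the label of their cell. The function names, the choice of placing the x-bit before the y-bit, and the uniform cell size are assumptions made for this example, not details fixed by the text.

```python
def row_wise_label(x: int, y: int, bits: int) -> int:
    """Row-wise enumeration: number the cells row by row."""
    return y * (1 << bits) + x


def z_order_label(x: int, y: int, bits: int) -> int:
    """Z-order label: interleave the bits of x and y, most significant first.

    Assumption: within each bit pair, the x-bit precedes the y-bit.
    """
    label = 0
    for i in range(bits - 1, -1, -1):
        label = (label << 1) | ((x >> i) & 1)
        label = (label << 1) | ((y >> i) & 1)
    return label


def cell_of(point, cell_size):
    """Grid cell (column, row) of a point with non-negative coordinates."""
    px, py = point
    return int(px // cell_size), int(py // cell_size)


def sort_points(points, cell_size, bits, label):
    """Sort points by the label of the grid cell in which they lie."""
    return sorted(points, key=lambda p: label(*cell_of(p, cell_size), bits))


if __name__ == "__main__":
    pts = [(3.5, 3.5), (0.5, 0.5), (1.5, 0.5), (0.5, 1.5)]
    print(sort_points(pts, 1.0, 2, row_wise_label))
    print(sort_points(pts, 1.0, 2, z_order_label))
```

Sorting the same points with the two label functions generally yields different orders, so the label function alone decides which cells end up adjacent in the one-dimensional key space.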
The way we label the cells thus determines how clustered adjacent cells are stored on secondary memory.

Figure 25 shows four common labelings. Figure 25a corresponds to a row-wise enumeration of the cells [Samet 1990b]. Figure 25b shows the cell enumeration imposed by the Peano curve [Morton 1966], also called quad codes [Finkel and Bentley 1974], N-trees [White 1981], locational codes [Abel and Smith 1983], or z-ordering [Orenstein and Merrett 1984]. Figure 25c shows the Hilbert curve [Faloutsos and Roseman 1989; Jagadish 1990a], and Figure 25d depicts Gray ordering [Faloutsos 1986, 1988], which is obtained by interleaving the Gray codes of the x- and y-coordinates in a bitwise manner. Gray codes of successive cells differ in exactly one bit.

Based on several experiments, Abel and Mark [1990] conclude that z-ordering and the Hilbert curve are most suitable as multidimensional access methods. Jagadish [1990a] and Faloutsos and Rong [1991] all prefer the Hilbert curve of those two.

Z-ordering is one of the few spatial access methods that has found its way into commercial database products. In particular, Oracle [1995] has adapted the technique and offered it for some time as a product.

An important advantage of all space-filling curves is that they are practically insensitive to the number of dimensions if the one-dimensional keys can be arbitrarily large. Everything is mapped into one-dimensional space, and one's favorite one-dimensional access method can be applied to manage the data. An obvious disadvantage of space-filling curves is that incompatible index partitions cannot be joined without recomputing the codes of at least one of the two.

5. SPATIAL ACCESS METHODS

All multidimensional access methods presented in the previous section have been designed to handle sets of data points and support spatial searches on them. None of those methods is directly applicable to databases containing objects with a spatial extension. Typical examples include geographic databases, containing mostly polygons, or mechanical CAD data, consisting of three-dimensional polyhedra. In order to handle such extended objects, point access methods have been modified using one of the following techniques:

(1) transformation (object mapping),
(2) overlapping regions (object bounding),
(3) clipping (object duplication), or
(4) multiple layers.

A simpler version of this classification was first introduced by Seeger and Kriegel [1988]. Later on, Kriegel et al. [1991] added another dimension to this taxonomy: a spatial access method's base type, that is, the spatial data type it supports primarily. Table 2 shows the resulting classification of spatial access methods. Note that most structures use the interval as a base type.

In the following sections, we present each of these techniques in detail, together with several SAMs based on it.

5.1 Transformation

One-dimensional access methods (Section 3.1) and PAMs (Section 4) can often be used to manage spatially extended objects, provided the objects are first transformed into a different representation. There are essentially two options: one can either transform each object into a higher-dimensional point [Hinrichs 1985; Seeger and Kriegel 1988], or transform it into a set of one-dimensional intervals by means of space-filling curves. We discuss the two techniques in turn.
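The first option can be sketched in a few lines for axis-parallel rectangles. The sketch is only illustrative: the function names are hypothetical, rectangles are assumed to be given as (xmin, ymin, xmax, ymax), and the centroid-based variant is assumed to store half-extents as its two additional parameters.

```python
def corner_transform(xmin, ymin, xmax, ymax):
    """Map a rectangle to a 4-D point made of two diagonal corners."""
    return (xmin, ymin, xmax, ymax)


def centroid_transform(xmin, ymin, xmax, ymax):
    """Map a rectangle to a 4-D point: centroid plus half-extents in x and y."""
    return (
        (xmin + xmax) / 2,
        (ymin + ymax) / 2,
        (xmax - xmin) / 2,
        (ymax - ymin) / 2,
    )


def from_centroid(cx, cy, hx, hy):
    """Invert centroid_transform, recovering the two diagonal corners."""
    return (cx - hx, cy - hy, cx + hx, cy + hy)


if __name__ == "__main__":
    rect = (1, 2, 5, 4)
    print(corner_transform(*rect))
    print(centroid_transform(*rect))
    print(from_centroid(*centroid_transform(*rect)))
```

Both mappings are invertible, so no information about the rectangle is lost; they differ only in where the resulting points fall in four-dimensional space.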
Figure 25. Four space-filling curves.

Mapping to Higher-Dimensional Space. Simple geometric shapes can be represented as points in higher-dimensional space. For example, it takes four real numbers to represent a (two-dimensional) rectangle in the plane. Those numbers may be interpreted as coordinates of a point in four-dimensional space. One possibility is to take the x- and y-coordinates of two diagonal corners (endpoint transformation); another option is based on the centroid and two additional parameters for the extension of the object in the x- and y-direction (midpoint transformation). Any such transformation maps a database of rectangles onto a database of four-dimensional points, which can then be managed by one of the PAMs discussed in the previous section. Search operations can be expressed as point and region queries in this dual space.

If the original database contains more complex objects, they have to be approximated (for example, by a rectangle or a sphere) before transformation. In this case, the point access method can lead to only a partial solution (cf. Figure 2).

Figures 26 and 27 show the dual space equivalents of some common queries. Figure 26 uses the endpoint transformation and Figure 27 the midpoint transformation. For presentation purposes, the figures show a mapping from one-dimensional intervals into points in the plane. Figures 26a and 27a show the transformation result for the range query with search range [l, u]. In dual space this range query maps into a general region query. Any point in dual space that lies in the shaded areas corresponds to an interval in original space that overlaps the search interval [l, u], and vice versa. Enclosure and containment queries with the interval [l, u] as argument also map into general region queries (Figures 26b and 27b). A point query, finally, maps into a range query for the endpoint transformation (Fig. 26c) and a general region query for the midpoint transformation (Fig. 27c).

Notwithstanding its conceptual elegance, this approach has several disadvantages. First, as the preceding examples indicate, the formulation of point and range queries in dual space is usually much more complicated than in original space [Nievergelt and Hinrichs 1987]. Finite search regions may map into infinite search regions in dual space, and some more complex queries involving spatial predicates may no longer be expressible at all [Henrich et al. 1989; Orenstein 1990; Pagel et al. 1993]. Second, depending on the mapping chosen, the distribution of points in dual space may be highly nonuniform even though the original data are uniformly distributed. With the endpoint transformation, for example, there are no image points below the main diagonal [Faloutsos et al. 1987]. Third, the images of two objects that are close in the original space may be arbitrarily far apart from each other in dual space.

Table 2. Classification of SAMs [Kriegel et al. 1991]

To overcome some of these problems, Henrich et al. [1989], Faloutsos and Rong [1991], as well as Pagel et al. [1993] have proposed special transformation and split strategies. A structure designed explicitly to be used in connection with the transformation technique is the LSD-tree (cf. Section 4.2.2). Performance studies by Henrich and Six [1991] confirm the claim that the LSD-tree adapts well to nonuniform distributions, which is of particular relevance in this context. It also contains a mechanism to avoid searching large empty query spaces, which may occur as a result of the transformation.

Space-Filling Curves for Extended Objects. Space-filling curves (cf. Section 4.3) are a very different type of transformation approach that seems to have fewer of the drawbacks listed in the previous section. Space-filling curves can be used to represent extended objects by a list of grid cells or, equivalently, a list of one-dimensional intervals that define the position of the grid cells concerned. In other words, a complex spatial object is approximated not by only one simpler object, but by the union of several such objects. There are different variations of this basic concept, including z-ordering [Orenstein and Merrett 1984], the Hilbert R-tree [Kamel and Faloutsos 1994], and the UB-tree [Bayer 1996]. As an example, we discuss z-ordering in more detail.

Figure 26. Search queries in dual space, endpoint transformation: (a) intersection query; (b) containment/enclosure queries; (c) point query.
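For one-dimensional intervals, the endpoint transformation and the queries of Figure 26 can be sketched as simple predicates on dual-space points. This is a minimal sketch under stated assumptions: intervals are closed, an interval with lower bound `lower` and upper bound `upper` maps to the point (lower, upper), and all function names are invented for illustration. The linear scan in `search` merely stands in for whatever point access method would manage the dual points.

```python
def to_dual(interval):
    """Endpoint transformation: an interval becomes the point (lower, upper)."""
    lower, upper = interval
    return (lower, upper)


def intersects(p, q):
    """Dual point p stems from an interval that overlaps the query interval q."""
    return p[0] <= q[1] and p[1] >= q[0]


def contained_in(p, q):
    """Dual point p stems from an interval lying completely inside q."""
    return q[0] <= p[0] and p[1] <= q[1]


def encloses(p, q):
    """Dual point p stems from an interval that completely covers q."""
    return p[0] <= q[0] and q[1] <= p[1]


def contains_point(p, x):
    """Point query: lower <= x <= upper, an unbounded range in dual space."""
    return p[0] <= x <= p[1]


def search(intervals, predicate, arg):
    """Return the intervals whose dual points satisfy the predicate."""
    return [iv for iv in intervals if predicate(to_dual(iv), arg)]


if __name__ == "__main__":
    ivs = [(0, 2), (1, 5), (3, 4), (6, 8)]
    print(search(ivs, intersects, (2, 4)))
    print(search(ivs, contained_in, (2, 4)))
    print(search(ivs, encloses, (2, 4)))
    print(search(ivs, contains_point, 3))
```

The point-query predicate describes a quadrant that is unbounded in dual space even though the query itself is a single point, and every image point satisfies lower <= upper, so no image points fall below the main diagonal.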
Figure 27. Search queries in dual space, midpoint transformation: (a) intersection query; (b) containment/enclosure queries; (c) point query.

For a discussion of the Hilbert R-tree, see Section 5.2.1.

Z-ordering [Orenstein and Merrett 1984] is based on the Peano curve. A simple algorithm to obtain the z-ordering representation of a given extended object can be described as follows. Starting from the (fixed) universe containing the data object, space is split recursively into two subspaces of equal size by (d − 1)-dimensional hyperplanes. As in the k-d-tree, the splitting hyperplanes are iso-oriented, and their directions alternate in fixed order among the d possibilities. The subdivision continues until one of the following conditions holds.

(1) The current subspace does not overlap the data object.
(2) The current subspace is fully enclosed in the data object.
(3) Some given level of accuracy has been reached.

The data object is thus represented by a set of cells, called Peano regions. As shown in Section 3.2.3, each such Peano region can be represented by a unique bit string, called its Peano code. Using those bit strings, the cells can then be stored in a standard one-dimensional index, such as a B+-tree.

Figure 28 shows a simple example. Figure 28a shows the polygon to be approximated, with the frame representing the universe. After several splits, starting with a vertical split line, we obtain Figure 28b. Nine Peano regions of different shapes and sizes approximate the object. The labeling of each Peano region is shown in Figure 28c. Consider the Peano region in the lower left part of the given polygon. It lies to the left of the first vertical hyperplane and below the first horizontal hyperplane, resulting in the first two bits being 00. As we further partition the lower left quadrant, the region lies on the left of the second vertical hyperplane but above the second horizontal hyperplane. The complete bit string accumulated so far is therefore 0001. In the next round of decompositions, the region lies to the right of the third vertical hyperplane and above the third horizontal hyperplane, resulting in two additional 1s. The complete bit string describing the region is therefore 000111.

Figures 28b and 28c also give some bit strings along the coordinate axes, which describe only the splits orthogonal to the given axis. The string 01 on the x-axis, for example, describes the subspace to the left of the first vertical split and to the right of the second vertical split. By bit-interleaving the bit strings that one finds when projecting a Peano region onto the coordinate axes, we obtain its Peano code. Note that if one Peano code is the prefix of another, the Peano region corresponding to the former encloses the Peano region corresponding to the latter. The Peano region corresponding to 00, for example, encloses the regions corresponding to 0001 and 000. This is an important observation, since it can be used for query processing [Gaede and Riekert 1994]. Figure 29 shows Peano regions for the running example.

Figure 28. Z-ordering of a polygon.

As z-ordering is based on an underlying grid, the resulting set of Peano regions is usually only an approximation of the original object. The termination criterion depends on the accuracy or granularity (maximum number of bits) desired. More Peano regions obviously yield more accuracy, but they also increase the size and complexity of the approximation. As pointed out by Orenstein [1989b], there are two possibly conflicting objectives: the number of Peano regions to approximate the object should be small, since this results in fewer index entries; and the accuracy of the approximation should be high, since this reduces the expected number of false drops [Orenstein 1989a, b; Gaede 1995b]. A false drop occurs when an object is paged in from secondary memory, only to find out that it does not satisfy the search predicate. A simple way to reduce the number of false drops is to add a single bit to the encoding that reflects for each Peano region whether it is completely enclosed in the original object [Gaede 1995a]. An advantage of z-ordering is that local changes of granularity lead to only local changes of the corresponding encoding.

5.2 Overlapping Regions

The key idea of the overlapping regions technique is to allow different data buckets in an access method to correspond to mutually overlapping subspaces. With this method we can assign any extended object directly and as a whole to one single bucket region. Consider, for instance, the k-d-B-tree for the running example, depicted in Figure 19, and one of the polygons given in the scenario (Figure 9), say r10. r10 overlaps two bucket regions, the one containing p10, c1, and c2, and the other one containing c10 and p9. If we extend one of those regions to accommodate r10, this polygon could be stored in the corresponding bucket. Note, however, that this extension inevitably leads to an overlap of regions.

Figure 29.

Search algorithms can be applied almost unchanged. The only differences are due to the fact that the overlap may increase the number of search paths we have to follow. Even a point query may require the investigation of multiple search paths because there may be several subspaces at any index level that include the search point. For range and region queries, the average number of search paths increases as well.

Hence, although functionality is not a problem when using overlapping regions, performance can be. This is particularly relevant when the spatial database contains objects whose size is large relative to the size of the universe. Typical examples are known from geographic applications where one must represent objects of widely varying size (such as buildings and states) in the same spatial database. Each insertion of a new data object may increase the overlap and therefore the average number of search paths to be traversed per query. Eventually, the overlap between subspaces may become large enough to render the index ineffective because one ends up searching most of the index for each single query. A well-known example where this degenerate behavior has been observed is the R-tree [Guttman 1984; Greene 1989]. Several modifications have been presented to mitigate these problems, including a technique to minimize the overlap [Roussopoulos and Leifker 1985]; see Section 5.2.1 for a detailed discussion.

A minor problem with overlapping regions concerns ambiguities during insertion. If we insert a new object, we could in principle enlarge any subspace to accommodate it. To optimize performance, there exist several strategies [Pagel et al. 1993]. For example, we could try to find the subspace that causes minimal additional overlap, or the one that requires the least enlargement. If it takes too long to compute the optimal strategy for every insertion, some heuristic may be used.

When a subspace needs to be split, one also tries to find a split that leads to minimal overall overlap. Guttman [1984], Greene [1989], and Beckmann et al. [1990] suggest some heuristics for this problem.

The R-Tree [Guttman 1984]. An R-tree corresponds to a hierarchy of d-dimensional intervals (boxes). Each node of the R-tree corresponds to a disk page and a d-dimensional interval. If the node is an interior node, then the intervals corresponding to its descendants are contained in its own interval. Intervals at the same tree level may overlap. If the node is a leaf node, its interval is the d-dimensional minimum bounding box (MBB) of the objects stored in it. For each object in turn, the leaf stores only its MBB and a reference to the complete object description. Other properties of the R-tree include the following.

- Every node contains between m and M entries unless it is the root. The lower bound m prevents the degeneration of trees and ensures an efficient storage utilization. Whenever the number of a node's descendants drops below m, the node is deleted and its descendants are distributed among the sibling nodes (tree condensation). The upper bound M can be derived from the fact that each tree node corresponds to exactly one disk page.
- The root node has at least two entries unless it is a leaf.
- The R-tree is height-balanced; that is, all leaves are at the same level. The height of an R-tree is at most ⌈log_m(N)⌉ for N index records (N > 1).

Searching in the R-tree is similar to the B-tree. At each index node, all index entries are tested to see whether they intersect the search interval. We then visit all child nodes whose intervals intersect the search interval. Due to the overlapping region paradigm, there may be several intervals that satisfy the search predicate. In the worst case, one may have to
visit every index page. Figure 30 shows an R-tree for the running example. Remember that the mi denote the MBBs of the polygonal data objects ri. A point query with search point X results in two paths, one leading from R8 to m7 and the other starting at R7.

Because the R-tree only manages MBBs, it cannot solve a given search problem completely unless, of course, the actual data objects are interval-shaped. Otherwise the result of an R-tree query is a set of candidate objects, whose actual spatial extent has to be tested for intersection with the search space (cf. Fig. 2). This step, which may involve additional disk accesses and considerable computation, has not been taken into account in most published performance analyses [Guttman 1984; Greene 1989].

To insert an object, we insert its minimum bounding interval and an object reference into the tree. In contrast to searching, we traverse only a single path from the root to the leaf. At each level we choose the child node whose corresponding interval needs the least enlargement to enclose the data object's interval. If several intervals satisfy this criterion, Guttman proposes selecting the descendant associated with the smallest interval. As a result, we insert the object only once; that is, the object is not dispersed over several buckets. Once we have reached the leaf level, we try to insert the object. If this requires an enlargement of the corresponding bucket region, we adjust it appropriately and propagate the change upwards. If there is not enough space left in the leaf, we split it and distribute the entries among the old and the new page. Once again, we adjust each of the new intervals accordingly and propagate the split up the tree.

As for deletion, we first perform an exact match query for the object in question. If we find it in the tree, we delete it. If the deletion causes no underflow, we check whether the bounding interval can be reduced in size. If so, we perform this adjustment and propagate it upwards. If the deletion causes node occupation to drop below the minimum number of entries, however, we copy the node content into a temporary node and remove it from the index. We then propagate the node removal up the tree, which typically results in the adjustment of several bounding intervals. Afterwards we reinsert all orphaned entries of the temporary node. Alternatively, we can merge the orphaned entries with sibling entries. In both cases, one may again have to adjust bounding intervals further up the tree.

In his original paper, Guttman [1984] discusses various policies to minimize the overlap during insertion. For node splitting, for example, Guttman suggests several algorithms, including a simpler one with linear time complexity and a more elaborate one with quadratic complexity. Later work by other researchers led to the development of more sophisticated policies. The packed R-tree [Roussopoulos and Leifker 1985], for example, computes an optimal partitioning of the universe and a corresponding minimal R-tree for a given scenario. However, it requires all data to be known a priori.

Figure 30.

Other interesting variants of the R-tree include the sphere tree by Oosterom [1990] and the Hilbert R-tree by Kamel and Faloutsos [1994]. The sphere tree corresponds to a hierarchy of nested d-dimensional spheres rather than intervals. The Hilbert R-tree combines the overlapping regions technique with space-filling curves (cf. Section 4.3). It first stores the Hilbert values of the data rectangles' centroids in a B+-tree, then enhances each interior B+-tree node by the MBB of the subtree below. This facilitates the insertion of new objects considerably. Together with a revised splitting policy, Kamel and Faloutsos report good performance results for both searches and updates. However, since their splitting policy takes only the objects' centroids into account, the performance of the structure is likely to deteriorate in the presence of large objects.

Ng and Kameda [1993] discuss how to support concurrency in R-trees by adapting the lock-coupling technique of B-trees [Bayer and Schkolnick 1977] to R-trees. Similarly, Ng and Kameda [1994] and Kornacker and Banks [1995] apply ideas of the B-link tree [Lehman and Yao 1981] to R-trees, yielding two structures both called the R-link tree. Kornacker and Banks empirically demonstrate that their R-link tree is superior to the R-tree using lock-coupling.

The R*-Tree [Beckmann et al. 1990]. Based on a careful study of R-tree behavior under different data distributions, Beckmann et al. [1990] identified several weaknesses of the original algorithms. In particular, they confirmed the observation of Roussopoulos and Leifker [1985] that the insertion phase is critical for good search performance. The design of the R*-tree (see Figure 31) therefore introduces a policy called forced reinsert: if a node overflows, it is not split right away. Rather, a number of entries are removed from the node and reinserted into the tree. This number is a parameter that may vary; Beckmann et al. suggest it should be about 30% of the maximal number of entries per page.

Another issue investigated by Beckmann et al. concerns the node-splitting policy. Although Guttman's R-tree algorithms tried only to minimize the area covered by the bucket regions, the R*-tree algorithms also take the following objectives into account.

- Overlap between bucket regions at the same tree level should be minimized. The less overlap, the smaller the probability that one has to follow multiple search paths.
- Region perimeters should be minimized. The preferred rectangle is the square, since this is the most compact rectangular representation.
- Storage utilization should be maximized.

Figure 31.

The improved splitting algorithm of Beckmann et al. [1990] is based on the plane-sweep paradigm [Preparata and Shamos 1985]. In d dimensions, its time complexity is O(d · n · log n) for a node with n intervals.

In summary, the R*-tree differs from the R-tree mainly in the insertion algorithm; deletion and searching are essentially unchanged. Beckmann et al. report performance improvements of up to 50% compared to the basic R-tree. Their implementation also shows that reinsertion may improve storage utilization. In broader comparisons, however, Hoel and Samet [1992] and Günther and Gaede [1997] found that the CPU time overhead of reinsertion can be substantial, especially for large page sizes; see Section 6 for further details.

One of the major insights of the R*-tree is that node splitting is critical for the overall performance of the access method. Since a naive (exhaustive) approach has a time complexity that is exponential in the number of given intervals, there is a need for efficient and optimal splitting policies. Becker et al. [1992] proposed a polynomial time algorithm that finds a balanced split, which also optimizes one of several possible objective functions (e.g., minimum sum of areas or minimum sum of perimeters). They assume in their analysis that the intervals are presorted in some specific order. More recently, Ang and Tan [1997] presented a new linear node splitting algorithm, based on a simple heuristic. According to the results reported, it outperforms its competitors.

Berchtold et al. [1996] proposed a modification of the R-tree called the X-tree that seems particularly well suited for indexing high-dimensional data. The X-tree reduces overlap among directory intervals by using a new organization: it postpones node splitting by introducing supernodes, that is, nodes larger than the usual block size. In order to find a suitable split, the X-tree also maintains the history of previous splits.

The P-Tree [Jagadish 1990c]. In many applications, intervals are not a good approximation of the data objects enclosed. In order to combine the flexibility of polygon-shaped containers with the simplicity of the R-tree, Jagadish [1990c] and Schiwietz [1993] independently proposed different variations of polyhedral trees or P-trees. To distinguish the two structures, we refer to the P-tree of Jagadish [1990c] as the JP-tree and to the P-tree of Schiwietz [1993] as the SP-tree.

The JP-tree first introduces a variable number m of orientations in the d-dimensional universe, where m > d. For instance, in two dimensions (d = 2) we may have four orientations (m = 4): two parallel to the coordinate axes (i.e., iso-oriented) and two parallel to the two main diagonals. Objects are approximated by minimum bounding polytopes whose faces are parallel to these m orientations. Clearly, the quality of the approximations is positively correlated with the number of orientations. We can now map the original space into an m-dimensional orientation space, such that each (d-dimensional) approximating polytope turns into an m-dimensional interval. Any point inside (outside) the polytope maps onto a point inside (outside) the interval, whereas the opposite is not necessarily true. To maintain the m-dimensional intervals, a large selection of SAMs is available; Jagadish [1990c] suggests the R-tree or R+-tree (cf. Section 5.3.2) for this purpose.

An interesting feature of the JP-tree is the ability to add hyperplanes to the attribute space dynamically without having to reorganize the structure. By projecting the new intervals of the extended orientation space onto the old orientation space, it is still possible to use the old structure. Consequently, we can obtain an R-tree from a higher-dimensional JP-tree structure by dropping all hyperplanes that are not iso-oriented.

The interior nodes of the JP-tree represent a hierarchy of nested polytopes, similar to the R-tree or the cell tree (cf. Section 5.3.3). Polytopes corresponding to different nodes at the same tree level may overlap. For search operations we first compute the minimum bounding polytope of the search region and map it onto an m-dimensional interval. The search efficiency then depends on the chosen PAM. The same applies for deletion.

The introduction of additional hyperplanes yields a better approximation, but it increases the size of the entries, thus reducing the fanout of the interior nodes. Experiments reported by Jagadish [1990c] suggest that a 10-dimensional orientation space (m = 10) is a good choice for storing 2-dimensional lines (d = 2) with arbitrary orientation. This needs to be compared to a simple MBB approach. Although the latter technique may sometimes render poor approximations, the representation requires only four numbers per line. Storing a 10-dimensional interval, on the other hand, requires 20 numbers, that is, five times as many. Another drawback of the JP-tree is the fixed orientation of the hyperplanes. Figure 32 shows the running example.

To overcome the problem of poor filtering, Brodsky et al. [1995] proposed methods for effectively computing a set of optimal axes for separating polyhedra. This work continues the line of work by Jagadish [1990c] in the use of nonstandard axes for better filtering.

The P-Tree [Schiwietz 1993]. The P-tree of Schiwietz, here called the SP-tree, chooses a slightly different approach for storing polygonal objects that tries to combine the advantages of the cell tree and the R*-tree for the two-dimensional case, while avoiding the drawbacks of both methods. Basically, the SP-tree is an R-tree whose interior nodes correspond to a nesting of polytopes rather than just rectangles. In general, the number of vertices (and therefore the storage requirements) of a polytope is not bounded. Moreover, when used for approximating other objects, the accuracy of the approximation is positively correlated with the number of vertices of the approximating convex polygon. On the other hand, when polytopes are used as index entries, there should be an upper bound in order to guarantee a minimum fanout of the interior nodes. To determine a reasonably good compromise between these conflicting objectives, extensive investigations have been conducted by Brinkhoff et al. [1993a] and Schiwietz [1993]. According to these studies, pentagons or hexagons seem to offer the best tradeoff between storage requirements and approximation quality.

Figure 32. P-tree [Jagadish 1990c].

If node splittings or insertions lead to additional vertices such that some bounding polygons have more vertices than the threshold, the surplus vertices are removed one by one. This leads to a larger area and therefore to a decrease of the quality of the approximation. To reduce overlap between the convex containers, Schiwietz suggests using a method similar to the R*-tree. Furthermore, in order to save storage space and to improve storage utilization, it is possible to restrict the number of orientations for the polygon edges (similar to the JP-tree). Figure 33 shows the SP-tree for the running example.

To our knowledge, no performance results have been reported so far for either of the two P-trees.

The SKD-Tree [Ooi et al. 1987; Ooi 1990]. A variant of the k-d-tree capable of storing spatially extended objects is the spatial k-d-tree or skd-tree. The skd-tree allows regions to overlap. To keep track of the mutual overlap, we store an upper and a lower bound with each discriminator, representing the maximal extent of the objects in the two subtrees. For example, consider the splitting hyperplane (discriminator) hx1 depicted in Figure 34 and its upper and lower bounds bx1 and bx2, respectively. The solid lines are the splitting hyperplanes and the dashed lines represent the upper and lower bounds of the corresponding subtrees. m3 is the rectangle closest to hx1 without crossing it, thus determining the maximum extent bx1 of the objects in the left (lower) subspace. Similarly, m5 determines the minimum extent bx2 for the right (upper) subspace. If none of the objects placed in the corresponding subspace crosses the splitting hyperplane, the lower bound of the upper interval is greater than the discriminator, and the upper bound of the lower interval is less. Leaf nodes of the binary tree contain the minimal bounds (dotted lines) of the objects in the corresponding data page.

Prior to inserting an object, we determine its centroid and its MBB. By comparing the centroid with the stored discriminators, we determine the next child to be inspected. Note that there is no ambiguity. During insertion, we have to adjust the upper and lower bounds for extended objects accordingly. Upon reaching the data node level, we test whether there is enough space available to accommodate the object. If so, we insert the object; otherwise we split the data node and insert the new discriminator into the skd-tree. Likewise, the bounds of the new subspaces need to be adjusted.

As usual, searching starts at the root and corresponds to a top-down tree traversal. At each interior node we check the discriminator and the boundaries to decide which child(ren) to visit next.

Deleting an object starts with an exact match query to determine the correct leaf node. If a deletion causes an underflow, we insert the remaining entries into the sibling data node and remove the splitting hyperplane. If this insertion results in an overflow, we split the page and insert the new hyperplane into the skd-tree. If no merge with a sibling leaf node is possible, we delete that leaf and its parent node. By redirecting the reference of the latter to its sibling (interior) node, we extend the subspace of the sibling. All affected entries are reinserted.

Figure 33. P-tree [Schiwietz 1993].

According to the results reported in Ooi [1990] and Ooi et al. [1991], the skd-tree is competitive with the R-tree both in storage utilization and search performance.

The GBD-Tree [Ohsawa and Sakauchi 1990]. The GBD-tree (generalized BD-tree) is an extension of the BD-tree [Ohsawa and Sakauchi 1983] that allows for secondary storage management and supports the management of extended objects. The BD-tree is a binary tree, but the GBD-tree is a balanced multiway tree that stores spatial objects as a hierarchy of minimum bounding boxes. Each leaf node (bucket) stores the MBBs of those objects whose centroids are contained in the corresponding bucket region. Each interior node stores the MBB of the (usually overlapping) MBBs of its descendants. The intervals are encoded using the same DZ-expressions as described in Section 3.2.3.

The one advantage of the GBD-tree over the R-tree is that insertions and deletions may be processed more efficiently, due to the encoding scheme and the placement by centroid. The latter point enables the GBD-tree to perform an insertion along a single path from the root to a leaf. However, no apparent advantage is gained in search performance. The reported performance experiments [Ohsawa and Sakauchi 1990] compare only storage utilization and insertion performance with the R-tree. The most important comparison, that of search performance, is omitted.

Figure 35 depicts a GBD-tree for the running example. The partitioning on the left-hand side shows the minimum bounding boxes (dotted or dashed) and the underlying intervals (Peano regions).

Among the approaches similar to the GBD-tree are an extension of the buddy tree by Seeger [1991] and the extension of the BANG file to handle extended spatial objects [Freeston 1989b].

Figure 34.

PLOP-Hashing [Kriegel and Seeger 1988; Seeger and Kriegel 1988]. Piecewise linear order-preserving (PLOP) hashing [Seeger and Kriegel 1988] is a variant of hashing that allows the storage of extended objects without transforming them into points. An earlier version of this structure [Kriegel and Seeger 1988] was only able to handle multidimensional point data.

PLOP-hashing partitions the universe in a similar way to the grid file: extended objects may span more than one directory cell. Hyperplanes extend along the axes of the data space. For the organization of these hyperplanes, PLOP-
hashing uses d binary trees, where d is the dimension of the universe. Each interior node of such a binary tree corresponds to a (d − 1)-dimensional iso-oriented hyperplane. The leaf nodes represent d-dimensional subspaces forming slices of the universe. Figure 36 depicts the binary trees for both axes together with the slices formed by them. By using the index entries that are stored in the leaf nodes, we can easily identify the data page for which we are looking. To do this efficiently, we have to keep the trees in main memory, similar to the scales of the grid file. For further speedup, the leaf nodes of each binary tree are linked to each other. In Figure 36 this is suggested by the arrows attached to the leaves of the trees.

To handle extended objects, we enlarge the storage representation of each slice by a lower and an upper bound. These bounds indicate the minimum and the maximum extension along the current dimension of all objects stored in the slice at hand.

Insertion is straightforward and similar to the grid file. To avoid ambiguities, PLOP-hashing uses the centroid of the object to determine the data bucket in which to place the object. In the case of node splitting and deletion we have to adjust the respective upper and lower bounds. It should be further noted that PLOP-hashing can be easily modified so that it supports clipping rather than overlapping regions. Analytical experiments indicate that PLOP-hashing is superior to the R-tree and R+-tree for uniform data distributions [Seeger and Kriegel 1988].

5.3 Clipping

Clipping-based schemes do not allow any overlaps between bucket regions; they have to be mutually disjoint. A typical access method of this kind is the R+-tree [Stonebraker et al. 1986; Sellis et al. 1987], a variant of the R-tree that allows no overlap between regions corresponding to nodes at the same tree level. As a result, point queries follow a single path starting at the root, which means efficient searches.

The main problems with clipping-based approaches relate to the insertion and deletion of data objects. During insertion, any data object that spans more than one bucket region has to be subdivided along the partitioning hyperplanes. Eventually, several bucket entries have to be created for the same object. Each bucket stores either the geometric description of the complete object (object duplication) or the geometric description of the corresponding fragment with an object reference. In any case, data about the object are dispersed among several data pages (spanning property). The resulting redundancy [Orenstein 1989a, b; Günther and Gaede 1997] may cause not only an increase in the average search time, but also an increase in the frequency of bucket overflows.

Figure 35.

A second problem applies to clipping-based access methods that do not partition the complete data space. In that case, the insertion of a new data object may lead to the enlargement of several bucket regions. Whenever the object (or a fragment thereof) is passed down to a bucket (or, in the case of a tree structure, an interior node) whose region does not cover it, the region has to be extended. In some cases, such an enlargement is not possible without getting an overlap with other bucket regions; this is sometimes called the deadlock problem of clipping. Because overlap is not allowed, one has to compute a new region structure, which can become very complicated. It may in particular cause further bucket overflows and insertions, which can lead to a chain reaction and, in the worst case, a complete breakdown of the structure [Günther and Bilmes 1991]. Access methods partitioning the complete data space do not suffer from this problem.

A final problem concerns the splitting of buckets. There may be situations where a bucket (and its corresponding region) has to be split but there is no splitting hyperplane that splits none (or only a few) of the objects in that bucket. The split may then trigger other splits, which may become problematic with increasing size of the database. The more objects are inserted, the higher the probability of splits and the smaller the average size of the bucket regions. New objects are therefore split into a larger number of smaller fragments, which may in the worst case once again lead to a chain reaction. To alleviate these problems, Günther and Noltemeier [1991] suggest storing large objects (which are more likely to be split into a large number of fragments) in special buckets called oversize shelves, instead of inserting them into the structure.

The Extended k-d-Tree [Matsuyama et al. 1984]. One of the earliest extensions of the adaptive k-d-tree that could handle extended objects was th
e Figure36.MultidimensionalAccessMethods·213ACMComputingSurveys,Vol.30,No.2,June1998 extendedk-d-tree.Incontrasttotheskd-tree(Section5.2.5),theextendedk-d-treeisbasedonclipping.Eachinte-riortreenodecorrespondstoa(dimensionalpartitioninghyperplane,representedbythedimension(e.g.,)andthesplittingcoordinate(thedis-criminator).Aleafnodecorrespondstoarectangularsubspaceandcontainstheaddressofthedatapagedescribingthatsubspace.Datapagesmaybereferencedbymultipleleafnodes.Toinsertanobject,westartattherootofthek-d-tree.Ateachinteriornode,wetestforintersectionwiththestoredhyperplane.Dependingonthelo-cationoftheobjectrelativetothehyper-plane,weeithermoveontothecorre-spondingchildnode,orwecliptheobjectbythehyperplaneandfollowbothbranches.Thisprocedureguaran-teesthatweinserttheobjectinalloverlappingbucketregions.Ifadatapagecannotaccommodatetheaddi-tionalobject,wesplitthepagebyanewhyperplane.Thesplittingdimensionisperpendiculartothedimensionwiththegreatestextension.Afterdistributingtheentriesofthedatapageamongthetwonewpages,weinsertthehyper-planeintothek-d-tree.Notethatthismayinturncausesomeobjectstobesplit,whichmayleadtofurtherpageoverflows.Todeleteanobject,wehavetovisitallsubspacesintersectingtheobjectanddeletethestoredobjectiden-tifier.Ifadatapageisemptyduetodeletion,weremoveitandmarkallleafnodespointingtothatpageas.Nomergingofsiblingnodesisperformed.Figure37depictsanextendedk-d-treefortherunningexample.Rectanglem7hasbeenclippedandinsertedintotwonodes.Mostpartitionscontainoneortwoadditionalboundinghyperplanes(dottedlines)toprovideabetterlocal-izationoftheobjectsinthecorrespond-ingsubspace.TheR-Tree[Stonebrakeretal.1986;Sellisetal.1987].Toovercometheproblemsassociatedwithoverlap-pingregionsintheR-tree,Sellisetal.[1987]introducedanaccessmethodcalledtheR-tree.UnliketheR-tree,theR-treeusesclipping;thatis,thereisnooverlapbetweenindexintervalsatthesametreelevel.Objectsthatin-tersectmorethanoneindexintervalhavetobestoredonseveraldifferentpages.Asaresultofthispolicy,pointsearchesinR-tre
escorrespondtosin-gle-pathtreetraversalsfromtheroottooneoftheleaves.TheythereforetendtobefasterthanthecorrespondingR-treeoperation.Rangesearchesusuallyleadtothetraversalofmultiplepathsinbothstructures.Wheninsertinganewobject,wemayhavetofollowmultiplepaths,de-pendingonthenumberofintersectionsoftheMBB)withindexintervals.Duringthetreetraversal,)maybesplitintodisjointfragments)).Eachfragmentisthenplacedinadifferentleafnode Figure37.Extendedk-d-tree.214·V.GaedeandO.GuÈntherACMComputingSurveys,Vol.30,No.2,June1998 Providedthatthereisenoughspace,theinsertionisstraightforward.Iftheboundinginterval)overlapsspacethathasnotyetbeencovered,wehavetoenlargetheintervalscorrespondingtooneormoreleafnodes.Eachoftheseenlargementsmayrequireconsiderableeffortbecauseoverlapsmustbeavoided.Insomerarecases,itmaynotbepossi-bletoincreasethecurrentintervalsinsuchawaythattheycoverthenewobjectwithoutsomemutualoverlapoverlapÈnther1988;Ooi1990].Incaseofsuchadeadlock,somedataintervalshavetobesplitandreinsertedintotheIfaleafnodeoverflowsithastobesplit.NodesplittingsworksimilarlytothecaseoftheR-tree.Animportantdifference,however,isthatsplitsmaypropagatenotonlyupthetree,butalsodownthetree.Theresultingforcedsplitofthenodesbelowmayleadtoseveralcomplications,includingfurtherfrag-mentationofthedataintervals;see,forexample,therectanglesm5andm8inFigure38.Fordeletion,wefirstlocateallthedatanodeswherefragmentsoftheob-jectarestoredandremovethem.Ifstor-ageutilizationdropsbelowagiventhreshold,wetrytomergetheaffectednodewithitssiblingsortoreorganizethetree.Thisisnotalwayspossible,whichisthereasonwhytheRcannotguaranteeaminimumspaceuti-TheCellTree[GuÈnther1988].Themaingoalinthedesignofthecelltree[GuÈnther1988;1989]wastofacili-tatesearchesondataobjectsofarbi-traryshapes,thatis,especiallyondataobjectsthatarenotintervalsthem-selves.Thecelltreeusesclippingtomanagelargespatialdatabasesthatmaycontainpolygonsorhigher-dimen-sionalpolyhedra.Itcorrespondstoadecompositionoftheuniverseintodis-jointconvexsubspaces.Theinteriornod
escorrespondtoahierarchyofnestedpolytopesandeachleafnodecor-respondstooneofthesubspaces(Figure39).EachtreenodeisstoredononediskToavoidsomeofthedisadvantagesresultingfromclipping,theconvexpoly-hedraarerestrictedtobesubspacesofaBSP(binaryspacepartitioning).There-forewecanviewthecelltreeasacombinationofaBSP-andanRorasaBSP-treemappedonpagedsec-ondarymemory.Inordertominimizethenumberofdiskaccessesthatoccurduringasearchoperation,theleafnodesofacelltreecontainalltheinfor-mationrequiredforansweringagivensearchquery;weloadnopagesotherthanthosecontainingrelevantdata.ThisisanimportantadvantageofthecelltreeovertheR-treeandrelatedBeforeinsertinganonconvexobject,wedecomposeitintoanumberofcon-vexcomponentswhoseunionistheorig- Figure38.MultidimensionalAccessMethods·215ACMComputingSurveys,Vol.30,No.2,June1998 inalobject.Thecomponentsdonothavetobemutuallydisjoint.Allcomponentsareassignedthesameobjectidentifierandinsertedintothecelltreeonebyone.Duetoclipping,wemayhavetosubdivideeachcomponentintoseveralcellsduringinsertion,becauseitover-lapsmorethanonesubspace.Eachcellisstoredinoneleafnodeofthecelltree.Ifaninsertioncausesadiskpagetooverflow,wehavetosplitthecorre-spondingsubspaceandcelltreenodeanddistributeitsdescendantsbetweenthetworesultingnodes.Eachsplitmaypropagateupthetree.Forpointsearches,westartattherootofthetree.UsingtheunderlyingBSPpartitioning,weidentifythesub-spacethatincludesthesearchpointandcontinuethesearchinthecorrespond-ingsubtree.Thisstepisrepeatedrecur-sivelyuntilwereachaleafnode,whereweexamineallcellstoseewhethertheycontainthesearchpoint.Thesolutionconsistsofthoseobjectsthatcontainatleastoneofthecellsthatqualify.Asimilaralgorithmexistsforrangesearches.Aperformanceevaluationofthecelltree[GuÈntherandBilmes1991]showsthatitiscompetitivewithotherpopularspatialaccessmethods.Figure39showsourrunningexamplewithfivepartitioninghyperplanes,eachofthemstoredintheinteriornodes.EventhoughthepartitioningbymeansoftheBSP-treeoffersmoreflexibilitythanrectilinearhyperplanes,clippingobjec
tsmaybeinevitable.InFigure39,wehadtosplitr2andinserttheresult-ingcellsintotwopages.Asdoallstructuresbasedonclipping,thecelltreehastocopewiththefrag-mentationofspace,whichbecomesin-creasinglyproblematicasmoreobjectsareinsertedintothetree.Aftersometime,mostnewobjectswillbesplitintofragmentsduringinsertion.Toavoidthenegativeeffectsofthisfragmenta-tion,GuÈntherandNoltemeier[1991]proposedtheconceptofoversizeshelvesOversizeshelvesarespecialdiskpagesattachedtotheinteriornodesofthetreethataccommodateobjectswhichwouldhavebeensplitintotoomanyfragmentsiftheyhadbeeninsertedregularly.Theauthorsproposeadynamicallyadjust-ingthresholdforchoosingbetweenplac-inganewobjectonanoversizeshelforinsertingitregularly.Performancere-sultsofGuÈntherandGaede[1997]showsubstantialimprovementscomparedtothecelltreewithoutoversizeshelves.5.4MultipleLayersThemultiplelayertechniquecanbere-gardedasavariantoftheoverlappingregionsapproach,becausedataregionsofdifferentlayersmayoverlap.How-ever,thereareseveralimportantdiffer-ences.First,thelayersareorganizedinahierarchy.Second,eachlayerparti-tionsthecompleteuniverseinadiffer-entway.Third,dataregionswithina Figure39.Celltree.216·V.GaedeandO.GuÈntherACMComputingSurveys,Vol.30,No.2,June1998 
layer are disjoint; that is, they do not overlap. Fourth, the data regions do not adapt to the spatial extensions of the corresponding data objects.

In order to get a better understanding of the multilayer technique, we discuss how to insert an extended object. First, we try to find the lowest layer in the hierarchy whose hyperplanes do not split the new object. If there is such a layer, we insert the object into the corresponding data page. If the insertion causes no page to overflow, we are done. Otherwise, we must split the data region by introducing a new hyperplane and distribute the entries accordingly. Objects intersecting the hyperplane have to be moved to a higher layer or an overflow page. As the database becomes populated, the data space of the lower layers becomes more and more fragmented. As a result, large objects keep accumulating on higher layers of the hierarchy or, even worse, it is no longer possible to insert objects without intersecting existing hyperplanes.

The multilayer approach seems to offer one advantage over the overlapping regions technique: a possibly higher selectivity during searching due to the restricted overlap of the different layers. However, there are also several disadvantages: the multilayer approach suffers from fragmentation, which may render the technique inefficient for some data distributions; certain queries require the inspection of all existing layers; it is not clear how to cluster objects that are spatially close to each other but in different layers; and there is some ambiguity about the layer in which to place the object.

An early static multilayer access method is the MX-CIF quadtree [Kedem 1982; Abel and Smith 1983; Samet 1990b]. This structure stores each extended spatial object with the quadtree node whose associated quadrant provides the tightest fit without intersecting the object. Objects within a node are organized by means of binary trees.

Sevcik and Koudas [1996] later proposed a similar SAM based on multiple layers, called the filter tree. As in the MX-CIF quadtree, each layer is the result of a regular subdivision of the universe. A new object is assigned to a unique layer, depending on the object's position and extension. Objects within one layer are first sorted by the Hilbert code of their center, then packed into data pages of a given size. Finally, the largest
Hilbert code of each data page, together with its reference, is inserted into a B-tree.

We continue with a detailed description of two dynamic SAMs based on multiple layers.

The Multilayer Grid File [Six and Widmayer 1988]. Yet another variant of the grid file capable of handling extended objects is the multilayer grid file (not to be confused with the multilevel grid file of Whang and Krishnamurthy [1985]). The multilayer grid file consists of an ordered sequence of grid layers. Each of these layers corresponds to a separate grid file (with freely positionable splitting hyperplanes) that covers the whole universe. A new object is inserted into the first grid file in the sequence that does not imply any clipping of the object. This is an important difference from the twin grid file (see Section 4.1.4), where objects can be moved freely between the two layers. If one of the grid files is extended by adding another splitting hyperplane, those objects that would be split have to be moved to another layer. Figure 40 illustrates a multilayer grid file with two layers for the running example.

In the multilayer grid file, the size of the bucket regions typically increases within the sequence; that is, larger objects are more likely to find their final location in later layers. If a new object cannot be stored in any of the current layers without clipping, a new layer has to be allocated. An alternative is to allow clipping only for the last layer. Six and Widmayer claim that d + 1 layers are sufficient to store a set of d-dimensional intervals without clipping if the hyperplanes are cleverly chosen.

For an exact match query, we can easily determine from the scales which grid file in the sequence is supposed to hold the search interval. Other search queries, in particular point and range queries, are answered by traversing the sequence of layers and performing a corresponding search on each grid file. The performance results reported by Six and Widmayer [1988] suggest that the multilayer grid file is superior to the conventional grid file, using clipping to handle extended objects. Possible disadvantages of the multilayer grid file include low storage utilization and expensive directory maintenance.

Figure 40. Multilayer grid file.

The R-File [Hutflesz et al. 1990]. To overcome some of the problems of the multilayer grid file, Hutflesz et al. [1990] proposed an alternative structure for managing sets of rectangles called the R-file; see Figure 41 for an example. In order to avoid the low storage utilization of the multilayer grid file, the R-file uses a single directory. The universe is partitioned similarly to the BANG file: splitting hyperplanes cut the universe recursively into equal parts, and z-ordering is used to encode the resulting bucket regions. In contrast to the BANG file, however, there are no excisions. Bucket regions may overlap, and there is no clipping. Each data interval is stored in the bucket with the smallest region that contains it entirely; overflow pages may be necessary in some cases.

An interesting feature of the R-file is its splitting algorithm. Rather than cutting a bucket region into two halves, we retain the original bucket region and create a new bucket for one of the two halves of that original region. Data intervals are then assigned to the new bucket if and only if they are completely contained in the corresponding region. The half is chosen in such a way that the distribution of data intervals between the two resulting buckets is most even. Once a region has been split, it may subsequently be split again, using the same algorithm. Since objects that are located near the middle of the universe are likely to intersect the partitioning hyperplanes, they are often assigned to the cell region corresponding to the whole universe. Thus objects in that cell tend to cluster near the splitting hyperplanes (cf. rectangle r5 in Figure 41). To avoid searching dead space, the R-file maintains minimum enclosing boxes of the stored objects. As shown by Hutflesz et al. [1990], this feature, together with the z-encoding of the partitions, makes the R-file competitive to the R-tree. One drawback of the R-file is the fact that it partitions the entire space, whereas the R-tree indexes only the part of the universe that contains objects. For data distributions that are nonuniform, the R-file therefore often performs poorly. This disadvantage is something that the R-file shares with the grid file. Widmayer [1991] also notes that the R-file is "algorithmically complicated."

Figure 41.

6. COMPARATIVE STUDIES

In this section, we give a brief overview of theoretical and experimental results on the comparison of different access methods. Unfortunately, the number of such evaluations, especially theoretical analyses, is rather limited.

Greene [1989] compares the search performance of the R-tree, the k-d-B-tree, and the R+-tree for 10,000 uniformly distributed rectangles of varying size. Query parameters include the size of the query rectangles and the page size. Greene's study shows that the k-d-B-tree can never really compete with the two R-tree variants. On the other hand, there is not much difference between the R+-tree and the R-tree, even though the former is significantly more difficult to code. As expected, the R+-tree performs better when there is less overlap between the data rectangles.

Kriegel et al. [1990] present an extensive experimental study of access-method performance for a variety of point distributions. The study involves four point access methods: the hB-tree, the BANG file, the two-level grid file, and the buddy tree. The authors decided not to include PLOP-hashing since its performance suffers considerably for nonuniform data. The zkdB+-tree of Orenstein and Merrett [1984] was also not included since the authors considered both the BANG file and the hB-tree as improvements of that strategy. Finally, Kriegel et al. did not include quantile hashing although they claim [Kriegel and Seeger 1987, 1989] that this structure is very efficient for nonuniform data. According to the benchmarks, the buddy tree and, to some degree, the BANG file outperform all other structures. The reported results show in an impressive way how the performance of the access methods studied varies with different data distributions and query range sizes. For clustered data and a query range of size 10% of the size of the universe, there is almost no performance difference between the buddy tree and the BANG file. If the size of the query range drops to only 0.1% of the size of the universe, however, the buddy tree performs about twice as fast.

For extended objects, Kriegel et al. [1990] compared the R-tree and PLOP-hashing with the buddy tree and the BANG file. The latter two techniques were enhanced by the transformation technique to handle rectangles. Once again, the buddy tree and the BANG file outperformed the other two access methods for nearly all data distributions. Note that the benchmarks measured only the number of page accesses but not the CPU time.

Beckmann et al. [1990] compared the R*-tree with several variants of the R-tree for a variety of data distributions. Besides the performance of the different structures for point, intersection, and enclosure queries for varying query region sizes, they also compared spatial join performance. The R*-tree is the clear winner for all data distributions and queries, and it also has the best storage utilization and insertion times. A comparison for point data confirms these results. Similarly to previous performance measurements, only the number of disk accesses is measured. A related study by Kamel and Faloutsos [1994] finds even better search results for the Hilbert R-tree, whereas updates take about the same time as for the R*-tree. The impact of global clustering on the search performance of the R*-tree was investigated by Brinkhoff and Kriegel [1994]. Kamel et al. [1996] use Hilbert codes for bulk insertion into dynamic R*-trees.

Seeger [1991] studied the relative performance of clipping, overlapping regions, and transformation techniques implemented on top of the buddy tree. He also included the two-level grid file and the R*-tree in the comparison. The buddy tree with clipping and the grid file failed completely for certain distributions, since they produced unmanageably large files. The transformation technique supports fast insertions at the expense of low storage utilization. The R*-tree, on the other hand, requires fairly long insertion times but offers good storage utilization. For intersection and containment queries, the buddy tree combined with overlapping regions is continuously superior to the buddy tree with transformation. The performance advantage of the overlapping regions technique decreases for larger query regions, even though the buddy tree with transformation never outperforms the buddy tree with overlapping regions. When the data set contains uniformly distributed rectangles of varying size, the buddy tree with clipping outperforms the other techniques for intersection and enclosure queries. For some queries the buddy tree with overlapping performs slightly better than the R*-tree.

Ooi [1990] compares a static and a dynamic variant of the skd-tree with the packed R-tree described by Roussopoulos and Leifker [1985]. For large page sizes, the skd-tree clearly outperforms the R-tree in terms of page accesses per search operation. The space requirements of the skd-tree, however, are higher than those of the R-tree. Since the skd-tree stores the extended objects by their centroids, containment queries are answered more efficiently than by the R-tree. This behavior is clearly reflected in the performance results. A comparison with the extended k-d-tree, enhanced by overflow pages, suggests that the skd-tree is superior, although the extended k-d-tree (which is based on clipping) performs rather well for uniformly distributed data.

Günther and Bilmes [1991] compare the R-tree to two clipping-based access methods, the cell tree and the R+-tree. Unlike most studies, the data sets consist of convex polygons instead of just rectangles. The cell tree requires up to twice as much space as its competitors.
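One reason a clipping-based structure can need more space than its competitors is that an object spanning several bucket regions is stored once per region. The following minimal Python sketch illustrates this under a simplifying assumption: bucket regions form a regular grid of square cells, which is not how the cell tree or the R+-tree partition space. The function names clip_to_grid and to_point, and the parameter cell, are hypothetical and chosen only for illustration.

```python
from math import floor


def clip_to_grid(rect, cell):
    """Clip an axis-parallel rectangle against a regular grid (assumed layout).

    rect is (xmin, ymin, xmax, ymax); cell is the side length of a grid cell.
    Returns one ((i, j), fragment) pair per grid cell the rectangle overlaps.
    """
    xmin, ymin, xmax, ymax = rect
    fragments = []
    for i in range(floor(xmin / cell), floor(xmax / cell) + 1):
        for j in range(floor(ymin / cell), floor(ymax / cell) + 1):
            fx0, fy0 = max(xmin, i * cell), max(ymin, j * cell)
            fx1, fy1 = min(xmax, (i + 1) * cell), min(ymax, (j + 1) * cell)
            # Skip zero-area slivers that arise when an edge lies on a cell border.
            if fx0 < fx1 and fy0 < fy1:
                fragments.append(((i, j), (fx0, fy0, fx1, fy1)))
    return fragments


def to_point(rect):
    """Transformation technique: a 2-d rectangle becomes one 4-d point."""
    return tuple(rect)


if __name__ == "__main__":
    r = (0.5, 0.5, 2.5, 1.5)
    print(len(clip_to_grid(r, 1.0)), "fragments under clipping")
    print(to_point(r), "as a single transformed point")
```

With unit cells, the rectangle (0.5, 0.5, 2.5, 1.5) is stored as six fragments under clipping, whereas the transformation technique stores a single four-dimensional point.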
However, the average number of page accesses per search operation is less than for the other two access methods. Moreover, this advantage tends to increase with the size of the database and the size of the query regions. Besides measurements on the number of page faults, CPU time measurements are also given.

Günther and Gaede [1997] compare the original cell tree as presented by Günther [1989] with the cell tree with oversize shelves [Günther and Noltemeier 1991], the R*-tree [Beckmann et al. 1990], and the hB-tree [Lomet and Salzberg 1989] for some real cartographic data. There is a slight performance advantage of the cell tree with oversize shelves compared to the R*-tree and the hB-tree, but a major difference from the original cell tree. An earlier comparison using artificially generated data can be found in Günther [1991]. Both studies suggest that oversize shelves may lead to significant improvements for access methods with clipping.

Oosterom [1990] compares the query times of his KD2B-tree and the sphere tree with the R-tree for different queries. The KD2B-tree is a paged version of the KD2-tree, which in turn is a variant of the BSP-tree. The two structures differ in two aspects: each interior node stores two iso-oriented lines to allow for overlap and gaps, and the corresponding partition lines do not clip; that is, an object is handled as a unit. The KD2B-tree outperforms the R-tree for all queries, whereas the sphere tree is inferior to the R-tree.

Hoel and Samet [1992] compare the performance of the PMR-quadtree [Nelson and Samet 1987], the R*-tree, and the R+-tree for indexing line segments. The R+-tree shows the best insertion performance, whereas the R*-tree occupies the least space. However, the insertion behavior of the R+-tree heavily depends on the page size, unlike the PMR-quadtree. The performance of all structures compared is about the same, even though the PMR-quadtree shows some slight performance benefits. Although the R*-tree is more compact than the other structures, its search performance is not as good as that of the R+-tree for line segments. Unfortunately, Hoel and Samet do not report the overall performance times for the different queries.

Peloux et al. [1994] carried out a similar performance comparison of two quadtree variants, a variant of the R+-tree, and the R*-tree. What makes their study different is that all structures have been implemented on top of a commercial object-oriented system using the application programmer interface. A further difference to Hoel and Samet [1992] is that Peloux et al. used polygons rather than line segments as test data. Furthermore, they report the various times for index traversal, loading polygons, and the like. Besides showing that the R+-tree and a quadtree variant based on Hierarchical EXCELL [Tamminen 1983] outperform the R*-tree for point queries, they clearly demonstrate that the database system must provide some means for physical clustering. Otherwise, reading a single index page may induce several page faults.

Smith and Gao [1990] compare the performance of a variant of the zkdB+-tree, the grid file, the R-tree, and the R+-tree for insertions, deletions, and search operations. They also measured storage utilization. The conclusion of their experiments is that z-ordering and the grid file perform well for insertions and deletions but deliver poor search performance. R- and R+-trees, in contrast, offer moderate insertion and deletion performance but superior search performance. Although the R+-tree performs slightly better than the R-tree for search operations, the authors conclude that the R+-tree is not a good choice for general-purpose applications, due to its potentially poor space utilization.

Hutflesz et al. [1990] showed that the R-file has a 10 to 20% performance advantage over the R-tree on a data set containing 48,000 rectangles with a high degree of overlap (each point in the
database was covered by 5.78 rectangles on the average).

Further experimental studies on the R-tree and related structures can be found in Frank and Barrera [1989], Kamel and Faloutsos [1992], and Kolovson and Stonebraker [1991].

Since the splitting of data buckets is an important operation in many structures, Henrich and Six [1991] studied several split strategies. Their theoretical analysis is verified by means of the LSD-tree. They also provide some performance results for the R-tree that uses their splitting strategy in comparison to the otherwise unchanged R-tree.

An empirical performance comparison of the R-tree with an improved variant of z-hashing, called layered z-hashing or lz-hashing [Hutflesz et al. 1988a], can be found in Hutflesz et al. [1991]. The proposed structure needs significantly fewer seek operations than the R-tree; average storage utilization is higher.

Jagadish [1990a] studies the properties of different space-filling curves (z-ordering, Gray-coding, and Hilbert-curve). By means of theoretical considerations as well as by experimental tests, he concludes that the Hilbert mapping from multidimensional space to a line is superior to other space-filling curves. These results are in accordance with those of Abel and Mark [1990].

When trying to summarize all those experimental comparisons, the following multidimensional access methods seem to be among the best-performing ones (in alphabetical order):

- buddy (hash) tree [Seeger and Kriegel 1990],
- cell tree with oversize shelves [Günther and Gaede 1997],
- Hilbert R-tree [Kamel and Faloutsos 1994],
- KD2B-tree [Oosterom 1990],
- PMR-quadtree [Nelson and Samet 1987],
- R+-tree [Sellis et al. 1987], and
- R*-tree [Beckmann et al. 1990].

It cannot be emphasized enough, however, that any such ranking needs to be used with great care. Clever programming can often make up for inherent deficiencies of an access method and vice versa. Other factors of unpredictable impact include the hardware used, the buffer size/page size, and the data sets. Note also that our list does not take into account access methods for which no comparative analyses have been published.

As the preceding discussion shows, although numerous experimental studies exist, they are hardly comparable. Theoretical studies may bring some more objectivity to this discussion. The problem with such studies is that they are usually very hard to perform if one wants to stick to realistic modeling assumptions. For that reason there are only a few theoretical results on the comparison of multidimensional access methods.

Regnier [1985] and Becker [1992] investigated the grid file and some of its variants. The most complete theoretical analysis of range trees can be found in Overmars et al. [1990] and Smid and Overmars [1990]. Günther and Gaede [1997] present a theoretical analysis of the cell tree. Recent analyses show that the theory of fractals seems to be particularly suitable for modeling the behavior of SAMs if the data distribution is nonuniform (see Faloutsos and Kamel [1994], Belussi and Faloutsos [1995], Faloutsos and Gaede [1996], and Papadopoulos and Manolopoulos [1997]).

Some more analytical work exists on the R-tree and related methods. A comparison of the R-tree and the R+-tree has been published by Faloutsos et al. [1987]. Recently, Pagel et al. [1993a] presented an interesting probabilistic model of window query performance for the comparison of different access methods independent of implementation details. Among other things, their model reveals the importance of the perimeter as a criterion for node splitting, which has been intuitively anticipated by the
inventors of the R*-tree [Beckmann et al. 1990]. The central formula of Pagel et al. [1993] to compute the number of disk accesses in an R-tree has been found independently by Kamel and Faloutsos [1993]. Faloutsos and Kamel [1994] later refined this formula by using properties of the data set. More recently, Theodoridis and Sellis [1996] proposed a theoretical model to determine the number of disk accesses in an R-tree that requires only two parameters: the amount of data and the density in the data space. Their model also extends to nonuniform distributions.

In pursuit of an implementation-independent comparison criterion for access methods, Pagel et al. [1995] suggest using the degree of clustering. As a lower bound they assume the optimal clustering of the static situation, that is, if the complete data set has been exposed beforehand. Incidentally, the significance of clustering for access methods has been demonstrated in numerous empirical investigations as well (see Jagadish [1990a], Kamel and Faloutsos [1993], Brinkhoff and Kriegel [1994], Kumar [1994b], and Ng and Han [1994]).

In the area of constraint database systems (see Gaede and Wallace [1997] for a recent survey) a number of interesting papers related to multidimensional access methods have been published. Kanellakis et al. [1993], for example, presented a semidynamic structure that guarantees certain worst-case bounds for space, search, and insertion. Subramanian and Ramaswamy [1995] and Hellerstein et al. [1997] complement this work by proving some important lower and upper bounds. Sexton [1997] and Stuckey [1997] look at indexing from a language point of view. Their work can be regarded as a generalization of work by Hellerstein et al. [1995], who proposed a generic framework for modeling hierarchical access methods.

7. CONCLUSIONS

Research in spatial database systems has resulted in a multitude of spatial access methods. Even for experts it becomes more and more difficult to recognize their merits and weaknesses, since every new method seems to claim superiority to at least one access method previously published. This survey did not try to resolve this problem but rather to give an overview of the pros and cons of a variety of structures. It will come as no surprise to the reader that at present no access method has proven itself superior to all its competitors in whatever sense. Even if one benchmark declares one structure as the clear winner, another benchmark may prove the same structure inferior.

But why are such comparisons so difficult? Because there are so many different criteria to define optimality, and so many parameters that determine performance. Both time and space efficiency of an access method strongly depend on the data processed and the queries asked. An access method that performs reasonably well for iso-oriented rectangles may fail for arbitrarily oriented lines. Strongly correlated data may render an otherwise fast access method irrelevant for any practical application. An index that has been optimized for point queries may be highly inefficient for arbitrary region queries. Large numbers of insertions and deletions may deteriorate a structure that is efficient in a more static environment.

The initiative of Kriegel et al. [1990] to set up a standardized testbed for benchmarking and comparing access methods under different conditions was an important step in the right direction. The World Wide Web provides a convenient infrastructure to access and distribute such benchmarks [Günther et al. 1988]. Nevertheless, it remains far from easy to compare or rank different access methods. Experimental benchmarks need to be studied with care and can only be a first indicator for usability.

When it comes to technology transfer, that is, to the use of access methods in commercial products, most vendors resort to structures that are easy to understand and implement. Quadtrees in SICAD [Siemens Nixdorf Informationssysteme AG 1997] and Smallworld GIS [Newell and Doe 1997], R-trees in Informix [Informix Inc. 1997], and Z-ordering in Oracle [Oracle Inc. 1995] are typical examples. Performance seems to be of minor importance in the selection, which comes as no surprise given the relatively small differences among methods in virtually all published analyses. Rather, the tendency is to take a structure that is simple and robust and to optimize its performance by a highly tuned implementation and tight integration with other system components.

Nevertheless, the implementation and experimental evaluation of access methods is essential as it often reveals deficiencies and problems that are not obvious from the design or a theoretical model. In order to make such comparative evaluations both easier to perform and easier to verify, it is essential to provide platform-independent access to the implementations of a broad variety of access methods. Some extensions of the World Wide Web, including our own MMM project [Günther et al. 1997], may provide the right technological base for such a paradigm change. Once every published paper includes a URL (uniform resource locator), that is, an Internet address that points to an implementation, possibly with a standardized user interface, transparency will increase substantially. Until then, most users will have to rely on general wisdom and their own experiments to select an access method that provides the best fit for their current application.

ACKNOWLEDGMENTS

While working on this survey, we had the pleasure of discussions with many colleagues. Special thanks go to D. Abel, A. Buchmann, C. Faloutsos, A. Frank, M. Freeston, J. C. Freytag, J. Hellerstein, C. Kolovson, H.-P. Kriegel, J. Nievergelt, J. Orenstein, P. Picouet, W.-F. Riekert, D. Rotem, J.-M. Saglio, B. Salzberg, H. Samet, M. Schiwietz, R. Schneider, M. Scholl, B. Seeger, T. Sellis, A. P. Sexton, and P. Widmayer. We would also like to thank the referees for their detailed and insightful comments.

REFERENCES

Abel, D. J. and Mark, D. M. 1990. A comparative analysis of some two-dimensional orderings. Int. J. Geograph. Inf. Syst. 4, 1, 21–31.

Abel, D. J. and Smith, J. L. 1983. A data structure and algorithm based on a linear key for a rectangle retrieval problem. Comput. Vis. 24.

Ang, C. and Tan, T. 1997. New linear node splitting algorithm for R-trees. In Advances in Spatial Databases, M. Scholl and A. Voisard, Eds., LNCS, Springer-Verlag, Berlin/Heidelberg/New York.

Aref, W. G. and Samet, H. 1994. The spatial filter revisited. In Proceedings of the Sixth International Symposium on Spatial Data Handling, 190–208.

Bayer, R. 1996. The universal B-tree for multidimensional indexing. Tech. Rep. I9639, Technische Universität München, Munich, Germany. http://www.leo.org/pub/comp/doc/

Bayer, R. and McCreight, E. M. 1972. Organization and maintenance of large ordered indices. Acta Inf. 1, 3, 173–189.

Bayer, R. and Schkolnick, M. 1977. Concurrency of operations on B-trees. Acta Inf. 9.

Becker, B., Franciosa, P., Gschwind, S., Ohler, T., Thiemt, F., and Widmayer, P. 1992. Enclosing many boxes by an optimal pair of boxes. In Proceedings of STACS'92, A. Finkel and M. Jantzen, Eds., LNCS 525, Springer-Verlag, Berlin/Heidelberg/New York, 475–486.

Becker, L. 1992. A new algorithm and a cost model for join processing with the grid file. Ph.D. thesis, Universität-Gesamthochschule Siegen, Germany.

Beckmann, N., Kriegel, H.-P., Schneider, R., and Seeger, B. 1990. The R*-tree: An efficient and robust access method for points and rectangles. In Proceedings of ACM SIGMOD International Conference on Management of Data, 322–331.

Belussi, A. and Faloutsos, C. 1995. Estimating the selectivity of spatial queries using the 'correlation' fractal dimension. In Proceedings of the 21st International Conference on Very Large Data Bases, 299–310.

Bentley, J. L. 1975. Multidimensional binary search trees used for associative searching. Commun. ACM 18, 9, 509–517.

Bentley, J. L. 1979. Multidimensional binary search in database applications. IEEE Trans. Softw. Eng. 4, 5, 333–340.

Bentley, J. L. and Friedman, J. H. 1979. Data
structures for range searching. ACM Comput. Surv. 11, 4, 397–409.
, S., K, D., , H.-P. 1996. The X-tree: An index structure for high-dimensional data. In Proceedings of the 22nd International Conference on Very Large Data Bases, (Bombay) 28–39.
, H., I, A., M, P., AND VAN DEN, B. 1990. The generalized grid file: Description and performance aspects. In ceedings of the Sixth IEEE International Conference on Data Engineering, 380–388.
, T. 1994. Der spatial join in geo-datenbanksystemen. Ph.D. Thesis, Ludwig-¨t München. Germany (in German).
, T., H.-P. 1994. The impact of global clustering on spatial database systems. In Proceedings of the Twentieth International Conference on Very Large Data, 168–179.
, T., K, H.-P., , R. 1993a. Comparison of approximations of complex objects used for approximation-based query processing in spatial database systems. Proceedings of the Ninth IEEE International Conference on Data Engineering, 40–
, T., K, H.-P., , B. 1993b. Efficient processing of spatial joins using R-trees. In Proceedings of ACM SIGMOD International Conference on Management of Data, 237–246.
, T., K, H.-P., S, R., , B. 1994. Multi-step processing of spatial joins. In Proceedings of the ACM SIGMOD International Conference on Management of Data, 197–208.
, A., L, C., L, J.-L., M. J. 1995. Separability of polyhedra for optimal filtering of spatial and constraint data. In Proceedings of the Fourteenth ACM SIGACT–SIGMOD–SIGART Symposium on Principles of Database Systems (San Jose, CA), 54–64.
, W. 1984. Index maintenance for non-uniform record distributions. In ings of the Third ACM SIGACT–SIGMOD Symposium on Principles of Database Sys-, 173–180.
, W. A. 1983. Interpolation-based index maintenance. BIT 23, 274–294.
, L., D, R., K, M., L, S., , D., , A. 1995. Access to multidimensional datasets on tertiary storage systems. Inf. Syst. 20, 2, 155–183.
, D. 1979. The ubiquitous B-tree. Comput. Surv. 11, 2, 121–138.
, S. P., P. G. 1986. Algorithms for BD-trees. Softw. Pract. Exper. 2, 1077–1096.
, S. P., P. G. 1991. Improved partial-match search algorithms for Comput. J. 34, 5, 415–422.
, M. 1989. Spatial query languages. Ph.D. Thesis, University of Maine, Orono,
, M. 1994. Spatial SQL: A query and presentation language. IEEE Trans. Knowl. Data Eng. 6, 1, 86–95.
, G. 1994. The hB-tree: A concurrent and recoverable multi-attribute index structure. Ph.D. Thesis, Northeastern University, Boston, MA.
, G., L, D., , B. 1995. The hB-tree: A modified hB-tree supporting concurrency, recovery and node consolidation. In Proceedings of the 21st International Conference on Very Large Data Bases
, R., N, J., P, N., , R. 1979. Extendible hashing: A fast access method for dynamic files. Trans. Database Syst. 4, 3, 315–344.
, C. 1986. Multiattribute hashing using Gray-codes. In Proceedings of the ACM SIGMOD International Conference on Management of Data, 227–238.
, C. 1988. Gray-codes for partial match and range queries. IEEE Trans. Softw. Eng. 14,
, C., V. 1996. Analysis of -dimensional quadtrees using the Hausdorff fractal dimension. In Proceedings of the 22nd International Conference on Very Large Data, (Bombay), 40–50.
, C., I. 1994. Beyond uniformity and independence: Analysis of R-trees using the concept of fractal dimension. In Proceedings of the Thirteenth ACM SIGACT–SIGMOD–SIGART Symposium on Principles of Database Systems, 4–13.
, C., Y. 1991. DOT: A spatial access method using fractals. In ings of the Seventh IEEE International Conference on Data Engineering, 152–159.
, C., S. 1989. Fractals for secondary key retrieval. In Proceedings of the Eighth ACM SIGACT–SIGMOD–SIGART Symposium on Principles of Database Sys-, 247–252.
, C., S, T., N. 1987. Analysis of object-oriented spatial access methods. In Proceedings of the ACM SIGMOD International Conference on Management of Data, 426–439.
, R., J. L. 1974. Quad trees: A data structure for retrieval of composite keys. Acta Inf. 4, 1, 1–9.
, P. 1983. On the performance evaluation of extendible hashing and trie searching. Acta Inf. 20,
, A., R. 1989. The field tree: A data structure for geographic information systems. In Design and Implementation of Large Spatial Database Systems, A. Buchmann, O. Günther, T. R. Smith, and Y.-F. Wang, Eds., LNCS 409, Springer-Verlag, Berlin/Heidelberg/New York, 29–44.
, M. 1987. The BANG file: A new kind of grid file. In Proceedings of the ACM SIGMOD International Conference on Management of Data, 260–269.
, M. 1989a. Advances in the design of the BANG file. In Proceedings of the Third International Conference on Foundations of Data Organization and Algorithms, LNCS 367, Springer-Verlag, Berlin/Heidelberg/New York, 322–338.
, M. 1989b. A well-behaved structure for the storage of geometric objects. In and Implementation of Large Spatial Database Systems, A. Buchmann, O. Günther, T. R. Smith, and Y.-F. Wang, Eds., LNCS 409, Springer-Verlag, Berlin/Heidelberg/New York,
, M. 1995. A general solution of the -dimensional B-tree problem. In of the ACM SIGMOD International Conference on Management of Data, 80–91.
, M. 1997. On the complexity of BV-tree updates. In Proceedings of CDB'97 and CP'96 Workshop on Constraint Databases and their Application, V. Gaede, A. Brodsky, O. Günther, D. Srivastava, V. Vianu, and M. Wallace, Eds., LNCS 1191, Springer-Verlag, Berlin/Heidelberg/New York, 282–293.
, H., A, G. D., , E. D. 1983. Near real-time shaded display of rigid Computer Graph. 17, 3, 65–72.
, H., K, Z., , B. 1980. On visible surface generation by a priori tree Computer Graph. 14,
, V. 1995a. Geometric information makes spatial query processing more efficient. In Proceedings of the Third ACM International Workshop on Advances in Geographic Information Systems (ACM-GIS'95) MD) 45–52.
, V. 1995b. Optimal redundancy in spatial database systems. In Advances in Spatial, M. J. Egenhofer and J. R. Herring, Eds., LNCS 951, Springer-Verlag, Berlin/Heidelberg/New York, 96–116.
, V., W.-F. 1994. Spatial access methods and query processing in the object-oriented GIS GODOT. In of the AGDM'94 Workshop (Delft, The Netherlands), Netherlands Geodetic Commission, 40–52.
, V., M. 1997. An informal introduction to constraint databases. In ceedings of CDB'97 and CP'96 Workshop on Constraint Databases and their Application V. Gaede, A. Brodsky, O. Günther, D. Srivastava, V. Vianu, and M. Wallace, Eds., LNCS 1191, Springer-Verlag, Berlin/Heidelberg/New York, 7–52.
, A. K., C. C. 1986. Order-preserving key transformation. ACM Trans. Database Syst. 11, 2, 213–234.
, D. 1989. An implementation and performance analysis of spatial data access methods. In Proceedings of the Fifth IEEE International Conference on Data Engineer-, 606–615.
, O. 1988. Efficient Structures for Geometric Data Management. LNCS 337, Springer-Verlag, Berlin/Heidelberg/New York.
, O. 1989. The cell tree: An object-oriented index structure for geometric databases. In Proceedings of the Fifth IEEE International Conference on Data Engineering 598–605.
, O. 1991. Evaluation of spatial access methods with oversize shelves. In Database Management Systems, G. Gambosi, M. Scholl, and H.-W. Six, Eds., Springer-Verlag, Berlin/Heidelberg/New York, 177–193.
, O. 1993. Efficient computation of spatial joins. In Proceedings of the Ninth IEEE International Conference on Data Engi-, 50–59.
, O., J. 1991. Tree-based access methods for spatial databases: Implementation and performance evaluation. Trans. Knowl. Data Eng. 3, 3, 342–356.
, O., A. 1990. Research issues in spatial databases. SIGMOD Rec. 19, 4, 61–68.
, O., V. 1997. Oversize shelves: A storage management technique for large spatial data objects. Int. J. Geog. Inf. Syst. 11, 1, 5–32.
, O., H. 1991. Spatial database indices for large extended objects. In Proceedings of the Seventh IEEE International Conference on Data Engineering, 520–
, O., M, R., S, P., B, R. 1997. MMM: A WWW-based approach for sharing statistical software modules. IEEE Internet Comput. 1,
, O., O, V., P, P., S, J.-M., , M. 1998. Benchmarking spatial joins à la carte. In Proceedings of the 10th International Conference on Scientific and Statistical Database Management. IEEE, New
, R. H. 1989. Gral: An extendible relational database system for geometric applications. In Proceedings of the Fifteenth International Conference on Very Large Data Bases 33–44.
, R. H., M. 1993.
Realms: A foundation for spatial data types in database systems. In Advances in Spatial Da-, D. Abel and B. C. Ooi, Eds., LNCS 692, Springer-Verlag, Berlin/Heidelberg/New
, A. 1984. R-trees: A dynamic index structure for spatial searching. In of the ACM SIGMOD International Conference on Management of Data, 47–54.
, J. M., K, E., , C. H. 1997. Towards a theory of indexability. In Proceedings of the Sixteenth ACM SIGACT–SIGMOD–SIGART Symposium on Principles of Database Systems
, J. M., N, J. F., , A. 1995. Generalized search trees for database systems. In Proceedings of the 21st International Conference on Very Large Data, 562–573.
, A. 1995. Adapting the transformation technique to maintain multidimensional non-point objects in k-d-tree based access structures. In Proceedings of the Third ACM International Workshop on Advances in Geographic Information Systems (ACM-GIS'95) more, MD) ACM Press, New York.
, A., J. 1995. Extending a spatial access structure to support additional standard attributes. In Advances in Spatial, M. J. Egenhofer and J. R. Herring, Eds., LNCS 951, Springer-Verlag, Berlin/Heidelberg/New York, 132–151.
, A., H.-W. 1991. How to split buckets in spatial data structures. In graphic Database Management Systems, G. Gambosi, M. Scholl, and H.-W. Six, Eds., Springer-Verlag, Berlin/Heidelberg/New York,
, A., S, H.-W., P. 1989. The LSD tree: Spatial access to multidimensional point and non-point objects. Proceedings of the Fifteenth International Conference on Very Large Data Bases, 45–53.
, K. 1985. Implementation of the grid file: Design concepts and experience. BIT 25, 569–592.
, E. G., H. 1992. A qualitative comparison study of data structures for large segment databases. In Proceedings of the ACM SIGMOD International Conference on Management of Data, 205–214.
, E. G., H. 1995. Benchmarking spatial join operations with spatial output. In Proceedings of the 21st International Conference on Very Large Data Bases, 606–
, A., S, H.-W., , P. 1988a. Globally order preserving multidimensional linear hashing. In Proceedings of the Fourth IEEE International Conference on Data Engineering, 572–579.
, A., S, H.-W., , P. 1988b. Twin grid files: Space optimizing access schemes. In Proceedings of the ACM SIGMOD International Conference on Management of Data, 183–190.
, A., S, H.-W., , P. 1990. The R-file: An efficient access structure for proximity queries. In Proceedings of the Sixth IEEE International Conference on Data Engineering, 372–379.
, A., W, P., , C. 1991. Global order makes spatial access faster. In Geographic Database Management, G. Gambosi, M. Scholl, and H.-W. Six, Eds., Springer-Verlag, Berlin/Heidelberg/New York, 161–176.
. 1997. The DataBlade architecture. URL http://www.informix.com.
, H. V. 1990a. Linear clustering of objects with multiple attributes. In of the ACM SIGMOD International Conference on Management of Data, 332–342.
, H. V. 1990b. On indexing line segments. In Proceedings of the Sixteenth International Conference on Very Large Data, 614–625.
, H. V. 1990c. Spatial search with polyhedra. In Proceedings of the Sixth IEEE International Conference on Data Engineer-, 311–319.
, I., C. 1992. Parallel R-trees. In Proceedings of the ACM SIGMOD International Conference on Management of, 195–204.
, I., C. 1993. On packing R-trees. In Proceedings of the Second International Conference on Information and Knowledge Management, 490–499.
, I., C. 1994. Hilbert R-tree: An improved R-tree using fractals. In Proceedings of the Twentieth International Conference on Very Large Data Bases, 500–
, I., K, M., , V. 1996. Bulk insertion in dynamic R-trees. In Proceedings of the Seventh International Symposium on Spatial Data Handling (Delft, The Netherlands), 3B.31–3B.42.
, P. C., R, S., V D. E., , J. S. 1993. Indexing for data models with constraints and classes. In Proceedings of the Twelfth ACM SIGACT–SIGMOD–SIGART Symposium on Principles of Database Systems, 233–243.
, G. 1982. The quad-CIF tree: A data structure for hierarchical on-line algorithms. Proceedings of the Nineteenth Conference on Design and Automation, 352–357.
, A., M. 1987. An analysis of geometric modeling in database sys- ACM Comput. Surv. 19, 1, 47–91.
, A. 1971. Pattern and search statistics. In Optimizing Methods in Statistics, S. Rustagi, Ed., 303–337.
, G. 1975. Hashing functions. Comput. J. 3, 265–278.
, C. 1990. Indexing techniques for multi-dimensional spatial data and historical data in database management systems. Ph.D. Thesis, University of California at Berkeley.
, C., M. 1991. Segment indexes: Dynamic indexing techniques for multi-dimensional interval data. In ceedings of the ACM SIGMOD International Conference on Management of Data, 138–147.
, H.-P. 1984. Performance comparison of index structures for multikey retrieval. In Proceedings of the ACM SIGMOD International Conference on Management of Data 186–196.
, H.-P., H, P., H, S., S, M., , R. 1991. An access method based query processor for spatial database systems. In Geographic Database Management Systems, G. Gambosi, M. Scholl, and H.-W. Six, Eds., Springer-Verlag, Berlin/Heidelberg/New York, 273–292.
, H.-P., S, M., S, R., , B. 1990. Performance comparison of point and spatial access methods. In and Implementation of Large Spatial Database Systems, A. Buchmann, O. Günther, T. R. Smith, and Y.-F. Wang, Eds., LNCS 409, Springer-Verlag, Berlin/Heidelberg/New York, 89–114.
, H.-P., B. 1986. Multi-dimensional order preserving linear hashing with partial expansions. In Proceedings of the International Conference on Database Theory LNCS 243, Springer-Verlag, Berlin/Heidelberg/New York.
, H.-P., B. 1987. Multi-dimensional quantile hashing is very efficient for non-uniform record distributions. In ceedings of the Third IEEE International Conference on Data Engineering, 10–17.
, H.-P., B. 1988. PLOP-hashing: A grid file without directory. In ceedings of the Fourth IEEE International Conference on Data Engineering, 369–376.
, H.-P., B. 1989. Multi-dimensional quantile hashing is very efficient for non-uniform distributions. Inf. Sci. 48, 99–117.
, M., D. 1995. High-concurrency locking in R-trees. In Proceedings of the 21st International Conference on Very Large Data Bases, 134–145.
, A. 1994a. G-tree: A new data structure for organizing multidimensional data. Trans. Knowl. Data Eng. 6, 2, 341–347.
, A. 1994b. A study of spatial clustering techniques. In Proceedings of the Fifth Conference on Database and Expert Systems Applications (DEXA'94), D. Karagiannis, Ed., LNCS 856, Springer-Verlag, Berlin/Heidelberg/New York, 57–70.
, P. A. 1980. Linear hashing with partial expansions. In Proceedings of the Sixth International Conference on Very Large Data, 224–232.
, P., S. 1981. Efficient locking for concurrent operations on B-trees. Trans. Database Syst. 6, 4, 650–670.
, K.-I., J, H., , C. 1994. The TV-tree: An index structure for high-dimensional data. VLDB J. 3, 4, 517–
, W. 1980. Linear hashing: A new tool for file and table addressing. In Proceedings of the Sixth International Conference on Very Large Data Bases, 212–223.
, M., C. 1994. Spatial joins using seeded trees. In Proceedings of the ACM SIGMOD International Conference on Management of Data, 209–220.
, D. B. 1983. Boundex index exponential ACM Trans. Database Syst. 8, 136–165.
, D. B. 1991. Grow and post index trees: Role, techniques and future potential. In vances in Spatial Databases, O. Günther and H. Schek, Eds., LNCS 525, Springer-Verlag, Berlin/Heidelberg/New York, 183–206.
, D. B., B. 1989. The hB-tree: A robust multiattribute search structure. In Proceedings of the Fifth IEEE International Conference on Data Engineering 296–304.
, D. B., B. 1990. The hB-tree: A multiattribute indexing method with good guaranteed performance. ACM Trans. Database Syst. 15, 4, 625–658. Reprinted in Readings in Database Systems, M. Stonebraker, Ed., Morgan-Kaufmann, San Mateo, CA, 1994.
, D. B., B. 1992. Access method concurrency with recovery. In ceedings of the ACM SIGMOD International Conference on Management of Data, 351–360.
, H., B.-C. 1993. Spatial indexing: Past and future. IEEE Data Eng. Bull. 16, 16–21.
, T., H, L. V., , M. 1984. A file organization for geographic information systems based on spatial proximity. J. Comput. Vis. Graph. Image Process. 26,
, G. 1966. A computer oriented geodetic data base and a new technique in file sequencing. IBM Ltd.
, R., H. 1987. A population analysis for hierarchical data structures. In Proceedings of the ACM SIGMOD International Conference on Management of Data 270–277.
, R. G., M. 1997. Discrete geometry with seamless topology in a GIS. URL
, R. T., J. 1994. Efficient and effective clustering methods for spatial data mining. In Proceedings of the Twentieth International Conference on Very Large Data Bases 144–154.
, V., T. 1993. Concurrent accesses to R-trees. In Advances in Spatial Da-, D. Abel and B. C. Ooi, Eds., LNCS 692, Springer-Verlag, Berlin/Heidelberg/New York, 142–161.
, V., T. 1994. The R-link tree: A recoverable index structure for spatial data. Proceedings of the Fifth Conference on Database and Expert Systems Applications, D. Karagiannis, Ed., LNCS 856, Springer-Verlag, Berlin/Heidelberg/New York,
, J. 1989. 72 criteria for assessing and comparing spatial data structures. In sign and Implementation of Large Spatial Database Systems, A. Buchmann, O. Günther, T. R. Smith, and Y.-F. Wang, Eds., LNCS 409, Springer-Verlag, Berlin/Heidelberg/New York,
, J., K. 1987. Storage and access structures for geometric data bases. In Proceedings of the International Conference on Foundations of Data Organiza-, S. Ghosh, Y. Kambayashi, and K. Tanaka, Eds., Plenum, New York.
, J., H, H., , K. 1981. The grid file: An adaptable, symmetric multikey file structure. In Proceedings of the Third ECI Conference, A. Duijvestijn and P. Lockemann, Eds., LNCS 123, Springer-Verlag, Berlin/Heidelberg/New York, 236–251.
, J., H, H., K. C. 1984. The grid file: An adaptable, symmetric multikey file structure. Trans. Database Syst. 9, 1, 38–71.
, Y., M. 1983. BD-tree: A -dimensional data structure with efficient dynamic characteristics. In of the Ninth World Computer Congress, IFIP, 539–544.
, Y., M. 1990. A new tree type data structure with homogeneous node suitable for a very large spatial database. In Proceedings of the Sixth IEEE International Conference on Data Engineering, 296–303.
, B. C. 1990. Efficient Query Processing in Geographic Information Systems. LNCS 471, Springer-Verlag, Berlin/Heidelberg/New York.
, B. C., M, K. J., R. 1987. Spatial kd-tree: An indexing mechanism for spatial databases. In ings of the IEEE Computer Software and Applications Conference, 433–438.
, B. C., S, R., K. J. 1991. Spatial indexing by binary decomposition and spatial bounding. Inf. Syst. J. 16, 2, 211–237.
, P. 1990. Reactive data structures for geographic information systems. Ph.D. Thesis, University of Leiden, The Nether-
. 1995. Oracle 7 multidimension: Advances in relational database technology for spatial data management. White paper.
, J. 1982. Multidimensional tries used for associative searching. Inf. Process. Lett. 14, 4, 150–157.
, J. 1983. A dynamic file for random and sequential accessing. In Proceedings of the Ninth International Conference on Very Large Data Bases, 132–141.
, J. 1989a. Redundancy in spatial databases. In Proceedings of the ACM SIGMOD International Conference on Management of Data, 294–305.
, J. 1989b. Strategies for optimizing the use of redundancy in spatial databases. In Design and Implementation of Large Spatial Database Systems, A. Buchmann, O. Günther, T. R. Smith, and Y.-F. Wang, Eds., LNCS 409, Springer-Verlag, Berlin/Heidelberg/New York,
, J. 1990. A comparison of spatial query processing techniques for native and parameter space. In Proceedings of the ACM SIGMOD International Conference on Management of Data, 343–352.
, J., T. H. 1984. A class of data structures for associative searching. Proceedings of the Third ACM SIGACT–SIGMOD Symposium on Principles of Database Systems, 181–190.
, J. A. 1986. Spatial query processing in an object-oriented database system. In Proceedings of the ACM SIGMOD International Conference on Management of Data 326–333.
, E. J. 1984. A mapping function for the directory of a multidimensional extendible hashing. In Proceedings of the Tenth International Conference on Very Large Data Bases
, E. J. 1985. Symmetric dynamic index maintenance scheme. In Proceedings of the International Conference on Foundations of Data Organization, Plenum, New York, 283–
, E. J. 1986. Balanced multidimensional extendible hash tree. In Proceedings of the Fifth ACM SIGACT–SIGMOD Symposium on Principles of Database Systems, 100–113.
, M. 1985. The interpolation based grid
file. In Proceedings of the Fourth ACM SIGACT–SIGMOD Symposium on Principles of Database Systems, 20–27.
, M., P. 1983. Storage mappings for multidimensional linear dynamic hashing. In Proceedings of the Second ACM SIGACT–SIGMOD Symposium on Principles of Database Systems, 90–105.
, M. A., O. 1992. A robust and efficient spatial data structure. Acta Inf.
, M. H., S, M. H., B, T., AND VAN, M. J. 1990. Maintaining range trees in secondary memory: Part I: Partitions. Acta Inf. 27, 423–452.
, B. U., S, H.-W., , H. 1993a. The transformation technique for spatial objects revisited. In Advances in Spatial Data-, D. Abel and B. C. Ooi, Eds., LNCS 692, Springer-Verlag, Berlin/Heidelberg/New York, 73–88.
, B. U., S, H.-W., , M. 1995. Window query optimal clustering of spatial objects. In Proceedings of the Fourteenth ACM SIGACT–SIGMOD–SIGART Symposium on Principles of Database Systems, 86–94.
, B. U., S, H.-W., T, H., , P. 1993b. Towards an analysis of range query performance in spatial data structures. In Proceedings of the Twelfth ACM SIGACT–SIGMOD–SIGART Symposium on Principles of Database Systems, 214–221.
, D., T, Y., S, T., , M. J. 1995. Topological relations in the world of minimum bounding rectangles: A study with R-trees. In of the ACM SIGMOD International Conference on Management of Data, 92–103.
, A., Y. 1997. Performance of nearest neighbor queries in R-trees. In Proceedings of the International Conference on Database Theory (ICDT'97), F. Afrati and P. Kolaitis, Eds., LNCS 1186, Springer-Verlag, Berlin/Heidelberg/New York, 394–408.
, J., R, G., , M. 1994. Evaluation of spatial indices implemented with the ÂnieÁrie des Systèmes d'Information 6
, F. P., M. I. 1985. putational Geometry. Springer-Verlag, New
, M. 1985. Analysis of the grid file algo- BIT 25, 335–357.
, J. T. 1981. The K-D-B-tree: A search structure for large multidimensional dynamic indexes. In Proceedings of the ACM SIGMOD International Conference on Management of, 10–18.
, D. 1991. Spatial join indices. In ceedings of the Seventh IEEE International Conference on Data Engineering, 10–18.
, N., D. 1984. An introduction to PSQL: A pictorial structured query language. In Proceedings of the IEEE Workshop on Visual Languages
, N., D. 1985. Direct spatial search on pictorial databases using packed R-trees. In Proceedings of the ACM SIGMOD International Conference on Management of Data, 17–31.
, H. 1994. Space-Filling Curves. Springer-Verlag, Berlin/Heidelberg/New York.
, H. 1984. The quadtree and related hierarchical data structure. ACM Comput. Surv. 2, 187–260.
, H. 1988. Hierarchical representation of collections of small rectangles. ACM Comput. Surv. 20, 4, 271–309.
, H. 1990a. Applications of Spatial Data. Addison-Wesley, Reading, MA.
, H. 1990b. The Design and Analysis of Spatial Data Structures. Addison-Wesley, Reading, MA.
, H., R. E. 1985. Storing a collection of polygons using quadtrees. Trans. Graph. 4, 3, 182–222.
, M. 1993. Speicherung und anfragebearbeitung komplexer geo-objekte. Ph.D. Thesis, Ludwig-Maximilians-Universität München, Germany (in German).
, R., H.-P. 1992. The TR*-tree: A new representation of polygonal objects supporting spatial queries and operations. In Proceedings of the Seventh Workshop on Computational Geometry, LNCS 553, Springer-Verlag, Berlin/Heidelberg/New York, 249–264.
, M., A. 1989. Thematic map modeling. In Design and Implementation of Large Spatial Database Systems, A. Buchmann, O. Günther, T. R. Smith, and Y.-F. Wang, Eds., LNCS 409, Springer-Verlag, Berlin/Heidelberg/New York.
, B. 1991. Performance comparison of segment access methods implemented on top of the buddy-tree. In Advances in Spatial, O. Günther and H. Schek, Eds., LNCS 525, Springer-Verlag, Berlin/Heidelberg/New York, 277–296.
, B., H.-P. 1988. Techniques for design and implementation of spatial access methods. In Proceedings of the Fourteenth International Conference on Very Large Data Bases, 360–371.
, B., H.-P. 1990. The buddy-tree: An efficient and robust access method for spatial database systems. In of the Sixteenth International Conference on Very Large Data Bases, 590–601.
, T., R, N., , C.
1987. The R-tree: A dynamic index for multi-dimensional objects. In Proceedings of the Thirteenth International Conference on Very Large Data Bases, 507–518.
, K., N. 1996. Filter trees for managing spatial data over a range of size granularities. In Proceedings of the 22th International Conference on Very Large Data (Bombay), 16–27.
, A. P. 1997. Querying indexed files. In Proceedings of the CDB'97 and CP'96 Workshop on Constraint Databases and Their Ap-, V. Gaede, A. Brodsky, O. Günther, D. Srivastava, V. Vianu, and M. Wallace, Eds., LNCS 1191, Springer-Verlag, Berlin/Heidelberg/New York, 263–281.
, S., D.-R. 1995. CCAM: A connectivity-clustered access method for aggregate queries on transportation networks: A summary of results. In Proceedings of the Eleventh IEEE International Conference on Data Engineering, 410–419.
1997. URL http://www.sni.de.
, H., P. 1988. Spatial searching in geometric databases. In Proceedings of the Fourth IEEE International Conference on Data Engineering, 496–503.
, M. H., M. H. 1990. Maintaining range trees in secondary memory part II: Lower bounds. Acta Inf. 27, 453–480.
, T. R., P. 1990. Experimental performance evaluations on spatial access methods. In Proceedings of the Fourth International Symposium on Spatial Data Han- ¨rich), 991–1002.
, M. (E.) 1994. Readings in Database Systems. Morgan-Kaufmann, San Mateo,
, M., S, T., , E. 1986. An analysis of rule indexing implementations in data base systems. In ings of the First International Conference on Expert Data Base Systems
, P. 1997. Constraint search trees. In Proceedings of the International Conference on Logic Programming (CLP'97), L. Naish, Ed., MIT Press, Cambridge, MA.
, S., S. 1995. The P-range tree: A new data structure for range searching in secondary memory. In ings of the ACM-SIAM Symposium on Discrete Algorithms (SODA'95)
, M. 1982. The extendible cell method for closest point problems. BIT 22, 27–41.
, M. 1983. Performance analysis of cell based geometric file organisations. J. Comp. Vis. Graph. Image Process. 24, 160–
, M. 1984. Comment on quad- and oc- Commun. ACM 30, 3, 204–212.
, Y., T. K. 1996. A model for the prediction of R-tree performance. In Proceedings of the Fifteenth ACM SIGACT–SIGMOD–SIGART Symposium on Principles of Database Systems, 161–171.
, H., H. 1981. Multi-dimensional range search in dynamically balanced trees. Angewandte Informatik 2,
, K.-Y., R. 1985. Multilevel grid files. IBM Research Laboratory, Yorktown Heights, NY.
, M. 1981. N-trees: Large ordered indexes for multi-dimensional space. Tech. Rep., Application Mathematics Research Staff, Statistical Research Division, US Bureau of the
, P. 1991. Datenstrukturen für Geo-datenbanken. In Entwicklungstendenzen bei, G. Vossen and K.-U. Witt, Eds., Oldenbourg-Verlag, Munich, Chapter 9, 317–361 (in German).

Received August 1995; revised August 1997; accepted January 1998

Multidimensional Access Methods

VOLKER GAEDE
IC-Parc, Imperial College, London

OLIVER GÜNTHER
Humboldt-Universität, Berlin

Search operations in databases require special support at the physical level. This is true for conventional databases as well as spatial databases, where typical search operations include the point query (find all objects that contain a given search point) and the region query (find all objects that overlap a given search region). More than ten years of spatial database research have resulted in a great variety of multidimensional access methods to support such operations. We give an overview of that work. After a brief survey of spatial data management in general, we first present the class of point access methods, which are used to search sets of points in two or more dimensions. The second part of the paper is devoted to spatial access methods to handle extended objects, such as rectangles or polyhedra. We conclude with a discussion of theoretical and experimental results concerning the relative performance of various approaches.

Categories and Subject Descriptors: H.2.2 [Database Management]: Physical access methods; H.2.4 [Database Management]: Systems; H.2.8 [Database Management]: Database Applications – spatial databases and GIS; H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval – search process, selection process

General Terms: Design, Experimentation, Performance

Additional Key Words and Phrases: Data structures, multidimensional access

1. INTRODUCTION

With an increasing number of computer applications that rely heavily on multidimensional data, the database community has recently devoted considerable attention to spatial data management. Although the main motivation originated in the geosciences and mechanical CAD, the range of possible applications has expanded to areas such as robotics, visual perception, autonomous navigation, environmental protection, and medical imaging [Günther and Buchmann 1990]. The range of interpretation given to

This work was partially supported by the German Research Society (DFG/SFB 373) and by the ESPRIT Working Group CONTESSA (8666).
Authors' address: Institut für Wirtschaftsinformatik, Humboldt-Universität zu Berlin, Spandauer Str. 1, 10178 Berlin, Germany; email:
Permission to make digital/hard copy of part or all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee.
1998 ACM 0360-0300/98/0600–0170 $05.00
ACM Computing Surveys, Vol. 30, No. 2, June 1998