Scalable Nearest Neighbor Algorithms for High Dimensional Data
Marius Muja and David G. Lowe, Member, IEEE
Computer Science Department, University of British Columbia, Vancouver, BC, Canada
mariusm@cs.ubc.ca, lowe@cs.ubc.ca
Keywords: nearest-neighbor search, randomized kd-trees, hierarchical k-means tree, clustering
Abstract: For many computer vision and machine learning problems, large training sets are key for good performance. However, the most computationally expensive part of many computer vision and machine learning algorithms consists of finding nearest neighbor matches to high dimensional vectors that represent the training data. We propose new algorithms for approximate nearest neighbor matching and evaluate and compare them with previous algorithms. For matching high dimensional features, we find two algorithms to …

… referred to as nearest neighbor matching. Having an efficient algorithm for performing fast nearest neighbor matching in large datasets can bring speed improvements …

… P such that the operation NN(q, P) can be performed efficiently. We are often interested in finding not just the first closest neighbor, but several closest neighbors. In this case, the search can be performed in several ways, depending on the number of neighbors returned and their distance to the query point: K-nearest neighbor (KNN) search, where the goal is to find the closest K points from the query point, and radius nearest neighbor search (RNN), where the goal is to find all the points located closer than some distance R from …

Nearest-neighbor search is a fundamental part of many computer vision algorithms and of significant importance in many other fields, so it has been widely studied. This section presents a review of previous work in this area.

2.1 Nearest Neighbor Matching Algorithms

We review the most widely used nearest neighbor techniques, classified in three categories: partitioning trees, hashing techniques and neighboring graph techniques.

2.1.1 Partitioning Trees

… paper we describe a modified k-means tree algorithm that we have found to give the best results for some datasets, while randomized k-d trees are best for others. Jégou et al. [27] propose the product quantization approach in which they decompose the space into low dimensional subspaces and represent the dataset points by compact codes computed as quantization indices in these subspaces. The compact codes are efficiently compared to the query points using an asymmetric approximate distance. Babenko and Lempitsky [28] propose the inverted multi-index, obtained by replacing the standard quantization in an inverted index with product quantization, obtaining …
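The two query types defined above (KNN and radius search) can be illustrated with a brute-force linear scan. This is a minimal sketch for clarity only; the function names are our own, and this is the naive baseline rather than any of the indexed algorithms the paper proposes:

```python
import math

def knn_search(query, points, k):
    """Exact K-nearest-neighbor search: the K points closest to the query."""
    return sorted(points, key=lambda p: math.dist(query, p))[:k]

def radius_search(query, points, radius):
    """Radius nearest-neighbor search: all points within distance R of the query."""
    return [p for p in points if math.dist(query, p) <= radius]
```

Both scans cost O(nd) per query on n points of dimension d, which is precisely the cost the tree-based indexes discussed in the following sections are designed to avoid.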
… is important for obtaining good search performance. In Section 3.4 we propose an algorithm for finding the optimum algorithm parameters, including the optimum branching factor. Fig. 3 contains a visualisation of several hierarchical k-means decompositions with different branching factors.

Another parameter of the priority search k-means tree is I_max, the maximum number of iterations to perform in the k-means clustering loop. Performing fewer iterations can substantially reduce the tree build time and results in a slightly less than optimal clustering (if we consider the sum of squared errors from the points to the cluster centres as the measure of optimality). However, we have observed that even when using a small number of iterations, the nearest neighbor search performance is similar to that of the tree constructed by running the clustering until convergence, as illustrated by Fig. 4. It can be seen that using as few as seven iterations we get more than 90 percent of the nearest-neighbor performance of the tree constructed using full convergence, but requiring less than 10 percent of the build time.

The algorithm to use when picking the initial centres in the k-means clustering can be controlled by the C_alg parameter. In our experiments (and in the FLANN library) we have …

Fig. 3. Projections of priority search k-means trees constructed using different branching factors: 4, 32, 128. The projections are constructed using the same technique as in [26], gray values indicating the ratio between the distances to the nearest and the second-nearest cluster centre at each tree level, so that the darkest values (ratio close to 1) fall near the boundaries between k-means regions.

Fig. 4. The influence that the number of k-means iterations has on the search speed of the k-means tree. The figure shows the relative search time compared to the case of using full convergence.
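The effect of the I_max parameter can be seen in a plain Lloyd's k-means loop. The sketch below is our own illustration, not FLANN code: it simply stops after i_max iterations instead of running to convergence, which is the trade-off the paragraph above describes.

```python
import random

def kmeans(points, k, i_max=7, seed=0):
    """Lloyd's k-means, capped at i_max iterations (the paper's I_max parameter).
    Stopping early trades a slightly worse clustering for a much faster build."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    for _ in range(i_max):
        # Assign every point to its nearest current centre.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centres[c])))
            clusters[i].append(p)
        # Recompute each centre as the mean of its assigned points;
        # an empty cluster keeps its previous centre.
        new_centres = [
            tuple(sum(q[d] for q in cl) / len(cl) for d in range(len(cl[0])))
            if cl else centres[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centres == centres:  # converged before reaching i_max
            break
        centres = new_centres
    return centres
```

The paper reports that about seven iterations already recover over 90 percent of the full-convergence search performance at less than 10 percent of the build time.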
… distances to all the cluster centres of the child nodes, an O(Kd) operation. The unexplored branches are added to a priority queue, which can be accomplished in O(K) amortized cost when using binomial heaps. For the leaf node, the distance between the query and all the points in the leaf needs to be computed, which takes O(Kd) time. In summary, the overall search cost is O(Ld(log n / log K)).

3.3 The Hierarchical Clustering Tree

Matching binary features is of increasing interest in the computer vision community, with many binary visual descriptors having recently been proposed: BRIEF [49], ORB [50], BRISK [51]. Many algorithms suitable for matching vector-based features, such as the randomized kd-tree and priority search k-means tree, are either not efficient or not suitable for matching binary features (for example, the priority search k-means tree requires the points to be in a vector space where their dimensions can be independently averaged). Binary descriptors are typically compared using the Hamming distance, which for binary data can be computed as a bitwise XOR operation followed by a bit count on the result (very efficient on computers with hardware support for counting the number of bits set in a word¹).

This section briefly presents a new data structure and algorithm, called the hierarchical clustering tree, which we found to be very effective at matching binary features. For a more detailed description of this algorithm the reader is encouraged to consult [47] and [52]. The hierarchical clustering tree performs a decomposition of the search space by recursively clustering the input dataset using random data points as the cluster centers of the non-leaf nodes (see Algorithm 3). In contrast to the priority search k-means tree presented above, for which using more than one tree did not bring significant improvements, we have found that building multiple hierarchical clustering trees and searching them in parallel using a common priority queue (the same approach that has been found to work well for randomized kd-trees [13]) resulted in significant improvements in the search performance.

3.4 Automatic Selection of the Optimal Algorithm

Our experiments have revealed that the optimal algorithm for approximate nearest neighbor search is highly dependent on several factors such as the data dimensionality, size and structure of the dataset (whether there is any correlation between the features in the dataset) and the desired search precision. Additionally, each algorithm has a set of parameters that have significant influence on the search performance (e.g., number of randomized trees, branching factor, number of k-means iterations). As we already mentioned in Section 2.2, the optimum parameters for a nearest neighbor algorithm are typically chosen manually, using various heuristics. In this section we propose a method for automatic selection of the best nearest neighbor algorithm to use for a particular dataset and for choosing its optimum parameters.

… being a cost function indicating how well the search algorithm A, configured with the parameters θ, performs on the given input data. …

1. The POPCNT instruction for modern x86_64 architectures.

… Nelder–Mead downhill simplex method [43] to further locally explore the parameter space and fine-tune the best solution obtained in the first step. Although this does not guarantee a global minimum, our experiments have shown that the parameter values obtained are close to optimum in practice. We use random sub-sampling cross-validation to generate the data and the query points when we run the optimization. In FLANN the optimization can be run on the full dataset for the most accurate results or using just a fraction of the dataset to have a faster auto-tuning process. The parameter selection needs to only be performed once for each type of dataset, and the optimum parameter values can be saved and applied to all future datasets of the same type.

4 EXPERIMENTS

For the experiments presented in this section we used a selection of datasets with a wide range of sizes and data dimensionality. Among the datasets used are the Winder/Brown patch dataset [53], datasets of randomly sampled data of different dimensionality, datasets of SIFT features … the memory usage is not a concern, …
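The XOR-plus-bit-count comparison of binary descriptors described in Section 3.3 can be sketched in a few lines. This is a minimal illustration assuming descriptors packed into Python integers; in practice the bit count maps to the hardware POPCNT instruction mentioned in the footnote:

```python
def hamming(a: int, b: int) -> int:
    """Hamming distance between two binary descriptors packed as integers:
    XOR the words, then count the set bits in the result."""
    return bin(a ^ b).count("1")
```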
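The two-stage parameter selection of Section 3.4, global exploration followed by local fine-tuning, can be caricatured as follows. The cost function and parameter grid are hypothetical stand-ins, and a crude hill-climb replaces the Nelder–Mead simplex step used in the paper:

```python
def auto_tune(param_grid, cost, refine_steps=3):
    """Pick the best parameter set from a coarse grid, then locally refine the
    branching factor (a simplified stand-in for the Nelder-Mead simplex step).
    `cost` maps a parameter dict to a scalar combining search time, build time
    and memory, as in the paper's cost function."""
    best = min(param_grid, key=cost)  # global exploration over the grid
    for _ in range(refine_steps):     # local fine-tuning around the best point
        for delta in (-1, 1):
            cand = dict(best, branching=max(2, best["branching"] + delta))
            if cost(cand) < cost(best):
                best = cand
    return best
```

As in the paper, the result is not guaranteed to be a global minimum, but a coarse global pass followed by local refinement is usually close to optimal in practice.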
… the data in the memory of a single machine for very large datasets. Storing the data on the disk involves significant performance penalties due to the performance gap between memory and disk access times. In FLANN we used the approach of performing distributed nearest neighbor search across multiple machines.

5.1 Searching on a Compute Cluster

In order to scale to very large datasets, we use the approach of distributing the data to multiple machines in a compute cluster and perform the nearest neighbor search using all the machines in parallel. The data is distributed equally between the machines, such that for a cluster of N machines each of them will only have to index and search 1/N of the whole dataset (although the ratios can be changed to have more data on some machines than others). The final result of the nearest neighbor search is obtained by merging the partial results from all the machines in the cluster once they have completed the search. In order to distribute the nearest neighbor matching on a compute cluster we implemented a Map-Reduce like algorithm using the Message Passing Interface (MPI) specification.

… the experiment on a single machine. Fig. 15 shows the performance obtained by using eight parallel processes on one, two or three machines. Even though the same number of parallel processes are used, it can be seen that the performance increases when those processes are distributed on more machines. This can also be explained by the memory access overhead, since when more machines are used, fewer processes are running on each machine, requiring fewer memory accesses.
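The merge step described above, in which each machine returns its local k nearest candidates and the globally best k are kept, can be sketched as follows (the function name is our own; the paper's MPI implementation is not shown):

```python
import heapq
from itertools import chain

def merge_partial_knn(partial_results, k):
    """Combine per-machine partial results, each a list of (distance, point_id)
    pairs, into the k globally nearest neighbors."""
    return heapq.nsmallest(k, chain.from_iterable(partial_results))
```

Because every machine already restricted itself to its own k best matches over its shard, the master only has to merge N*k candidates rather than touch the full dataset.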
…, 2005, vol. 1, pp. 26–33.
[7] A. Torralba, R. Fergus, and W. T. Freeman, "80 million tiny …
…, 2009, pp. 248–255.
[9] J. L. Bentley, "Multidimensional binary search trees used for associative searching," Commun. ACM, vol. 18, no. 9, pp. 509–517, 1975.
[10] J. H. Friedman, J. L. Bentley, and R. A. Finkel, "An algorithm for finding best matches in logarithmic expected time," ACM Trans. Math. Softw., vol. 3, no. 3, pp. 209–226, 1977.
[11] S. Arya, D. M. Mount, N. S. Netanyahu, R. Silverman, and A. Y. Wu, "An optimal algorithm for approximate nearest neighbor searching in fixed dimensions," J. ACM, vol. 45, no. 6, pp. 891–923, 1998.
[12] J. S. Beis and D. G. Lowe, "Shape indexing using approximate nearest-neighbour search in high-dimensional spaces," in Proc. IEEE Conf. Comput. Vis. Pattern Recog., 1997, pp. 1000–1006.
[13] C. Silpa-Anan and R. Hartley, "Optimised KD-trees for fast image …
…, 2008, pp. 537–546.
[17] Y. Jia, J. Wang, G. Zeng, H. Zha, and X. S. Hua, "Optimizing kd-trees for scalable visual descriptor indexing," in Proc. IEEE Conf. Comput. Vis. Pattern Recog., 2010, pp. 3392–3399.
[18] K. Fukunaga and P. M. Narendra, "A branch and bound algo…
[23] T. Liu, A. Moore, A. Gray, and K. Yang, "An investigation of practical approximate nearest neighbor algorithms," presented at the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 2004.
[24] D. Nister and H. Stewenius, "Scalable recognition with a vocabulary tree," in Proc. IEEE Conf. Comput. Vis. Pattern Recog., 2006, pp. 2161–2168.
[25] B. Leibe, K. Mikolajczyk, and B. Schiele, "Efficient clustering and matching for object class recognition," in Proc. British Mach. Vis. Conf., 2006, pp. 789–798.
[26] G. Schindler, M. Brown, and R. Szeliski, "City-scale location recognition," in Proc. IEEE Conf. Comput. Vis. Pattern Recog., 2007, pp. 1–7.
[27] H. Jégou, M. Douze, and C. Schmid, "Product quantization for nearest neighbor search," IEEE Trans. Pattern Anal. Mach. Intell., vol. 32, no. 1, pp. 1–15, Jan. 2010.
[28] A. Babenko and V. Lempitsky, "The inverted multi-index," in Proc. IEEE Conf. Comput. Vis. Pattern Recog., 2012, pp. 3069–3076.
[29] A. Andoni and P. Indyk, "Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions," Commun. ACM, vol. 51, no. 1, pp. 117–122, 2008.
[30] Q. Lv, W. Josephson, Z. Wang, M. Charikar, and K. Li, "Multi-probe LSH: Efficient indexing for high-dimensional similarity search," in Proc. Int. Conf. Very Large Data Bases …
… Comput. Vis. Pattern Recog., 2012, pp. 1106–1113.
[43] J. A. Nelder and R. Mead, "A simplex method for function minimization," Comput. J., vol. 7, no. 4, pp. 308–313, 1965.
[44] F. Hutter, "Automated configuration of algorithms for solving hard computational problems," Ph.D. dissertation, Comput. Sci. Dept., Univ. British Columbia, Vancouver, BC, Canada, 2009.
[45] F. Hutter, H. H. Hoos, and K. Leyton-Brown, "ParamILS: An automatic algorithm configuration framework," J. Artif. Intell. Res. …
[47] M. Muja, "Scalable nearest neighbour methods for high dimensional data," Ph.D. dissertation, Comput. Sci. Dept., Univ. British Columbia, Vancouver, BC, Canada, 2013.
[48] D. Arthur and S. Vassilvitskii, "k-means++: The advantages of careful seeding," in Proc. Symp. Discrete Algorithms, 2007, pp. 1027–1035.
[49] M. Calonder, V. Lepetit, C. Strecha, and P. Fua, "BRIEF: Binary …
…, 2011, pp. 2548–2555.
[52] M. Muja and D. G. Lowe, "Fast matching of binary features," in Proc. 9th Conf. Comput. Robot Vis., 2012, pp. 404–410.
[53] S. Winder and M. Brown, "Learning local image descriptors," in Proc. IEEE Conf. Comput. Vis. Pattern Recog., 2007, pp. 1–8.
[54] K. Mikolajczyk and J. Matas, "Improving descriptors for fast tree …