ARTICLE IN PRESS

Please cite this article as: Anna Bosch et al., A review: Which is the best way to organize/classify images by content?, Image and Vision Computing (2006), doi:10.1016/j.imavis.2006.07.015


Sub-blocks: the image is first partitioned into several blocks, and then features are extracted from each of those blocks.

In this section a review of the most recent and representative proposals of both global and sub-block approaches is presented.

2.1. Global

Vailaya et al. [10-12] consider the hierarchical classification of vacation images, and show that low-level features can successfully discriminate between many scene types using a hierarchical structure. Using binary Bayesian classifiers, they attempt to capture high-level concepts from low-level image features under the constraint that the test image belongs to one of the classes. At the highest level, images are classified as indoor or outdoor; outdoor images are further classified as city or landscape; finally, a subset of landscape images is classified into sunset, forest, and mountain classes. Different qualitative measures, extracted from the whole image, are used at each level depending on the classification problem: indoor/outdoor (using spatial colour moments); city/landscape (edge direction coherence vectors), and so on.

The classification problem is addressed by using Bayes decision theory. Each image is represented by a feature vector extracted from the image. The probabilistic models required for the Bayesian approach are estimated during a training step. Consider $n$ training samples from a class $\omega$. A vector quantiser is used to extract $q$ codebook vectors, $\mathbf{v}_j$, from the $n$ training samples. The class-conditional density of a feature vector $\mathbf{y}$ given the class $\omega$, i.e., $f_Y(\mathbf{y} \mid \omega)$, is then approximated by a mixture of Gaussians (with identity covariance matrices), each centered at a codebook vector, resulting in:

$$ f_Y(\mathbf{y} \mid \omega) \propto \sum_{j=1}^{q} m_j \exp\left( -\frac{\lVert \mathbf{y} - \mathbf{v}_j \rVert^2}{2} \right), $$

where $m_j$ is the proportion of training samples assigned to $\mathbf{v}_j$. The Bayesian classifier is then defined using the maximum a posteriori (MAP) criterion as follows:

$$ \hat{\omega} = \arg\max_{\omega \in \Omega} \{ p(\omega \mid \mathbf{y}) \} = \arg\max_{\omega \in \Omega} \{ f_Y(\mathbf{y} \mid \omega)\, p(\omega) \}, $$

where $\Omega$ is the set of pattern classes and $p(\omega)$ represents the a priori class probability.

The proposal reports an excellent performance at each level of the hierarchy over a set of 6931 images. However, it suffers from a limitation inherent to hierarchical classifiers: the cascading of errors. Classifying a test image, for example a forest, into a category implies that the image has to be classified successfully at several stages, (1) outdoor, (2) landscape, and (3) forest, with a probability of error at each level. An initial mistake cannot be corrected at lower levels.

Also in […] global features are used to produce a set of semantic labels with a certain belief for each image. They manually label each training image with a semantic label and train $k$ classifiers (one for each semantic label) using support vector machines (SVM). Each test image is classified by the $k$ classifiers and assigned a confidence score for the label that each classifier is attempting to predict. As a result, a $k$-nary label-vector consisting of $k$-class membership is generated for each image. This approach is especially useful for Content Based Image Retrieval (CBIR) and Relevance Feedback (RF) systems. Other authors have followed this global approach, although they have taken other aspects into account. For example, Shen et al. …

Fig. 2. Scene classification approaches (diagram labels: Input image; Low-level Modelling: Global, Sub-Blocks; Semantic Modelling: Concepts, Properties). Low-level or semantic modelling is the main property that distinguishes the basic strategies for tackling the proposed classification. Several approaches have been identified in both main strategies depending on how they achieve the final scene classification.

… posterior probability can be expressed by the joint probability, which can be further expressed by the conditional and prior probabilities:

$$ P(S \mid E) = \frac{P(S, E)}{P(E)} = \frac{P(E \mid S)\, P(S)}{P(E)}, $$
where $S$ denotes the semantic task and $E$ denotes evidence. The efficacy of this framework is demonstrated via three applications involving semantic understanding of pictorial images: (i) detection of the main photographic subjects in an image [22], (ii) selecting the most appealing image in an event, and (iii) classifying images into indoor or outdoor scenes. This last application refers specifically to the problem of scene classification [23]. The performance is quantitatively evaluated using only low-level features (Ohta colour space histograms and MSAR texture features as in […]), and then incorporating semantic features (sky and grass objects). They demonstrate that the classification performance can be significantly improved when semantic features are employed in the classification process.

Aksoy et al. [24] also applied a Bayesian framework in a visual grammar. Scene representation is achieved by decomposing the image into prototype regions and modelling the interactions between these regions in terms of their spatial relationships. Initially an image segmentation is performed using a classical split-and-merge algorithm. Then, the technique automatically learns representative region groups which discriminate different scenes and builds visual grammar models.

Similarly, in [25], after segmenting the image into regions, features are extracted and regions classified. Finally, based on this local classification the algorithm classifies the entire image. Their main contribution is the finding that adding eigenregions (the principal components of the intensity of the region) to the feature vector improves region classification results and, in turn, the image classification rates. A similar approach was proposed by Mojsilovic et al. [26], where the authors first segment the image using colour and texture information to find the semantic indicators (e.g. skin, sky, water, etc.). Then, these objects are used to identify the semantic categories (i.e. people, outdoor, landscapes, etc.).

Finally, we can also include in this approach the proposal of Vogel and Schiele [27,28], although in this case the segmentation is performed by a simple spatial grid layout which splits the image into regular subregions. The technique uses both colour and texture to perform landscape scene classification and retrieval based on a two-stage system. First, the image is partitioned into 10×10 subregions, and each one is classified using K-NN or SVM. An image is then represented by a so-called concept occurrence vector (COV), which measures the frequency of different objects in a particular image. The average COV over all members of a category defines the category prototype

$$ \mathbf{p}^{c} = \frac{1}{N_c} \sum_{j=1}^{N_c} \mathrm{COV}(j), $$

where $c$ refers to one of the scene categories and $N_c$ to the number of images in that category. Given this image representation, a prototypical representation for each scene category can be learnt. Scene classification is carried out by using the prototypical representation itself or Multi-SVM.

3.2. Local semantic concepts

In recent years the literature on scene classification shows an increasing number of proposals which make use of local semantic concepts. An intermediate semantic level representation is introduced as a first step between image properties and scene classification in order to deal with the semantic gap between low-level features and high-level concepts. However, none of these proposals relies on an initial segmentation. Instead, the content of the scene is described by local descriptors, for example codewords [29-32] as shown in Fig. 4. What these methods have in common is that they build on bag-of-words, a technique used for statistical text analysis.

3.2.1. Bag-of-words

The bag-of-words methodology was first proposed for text document analysis and further adapted for computer vision applications. The models are applied to images by using a visual analogue of a word, formed by vector quantising visual features (colour, texture, etc.) like region descriptors. Recent works have shown that local features represented by bags-of-words are suitable for scene classification, showing impressive levels of performance [33-36].

Fig. 4. Local descriptors represented by 5×5 patches at greyscale.
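The bag-of-words representation described in Section 3.2.1 can be illustrated with a minimal sketch. It assumes that a visual vocabulary (codebook) has already been obtained by vector quantisation; the function names, the toy two-dimensional descriptors and the three codewords below are hypothetical and are not taken from the reviewed works. Each local descriptor is assigned to its nearest codeword, and the image is then summarised by the normalised histogram of codeword occurrences.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_codeword(descriptor, codebook):
    """Index of the codebook vector closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(descriptor, codebook[i]))


def bag_of_words(descriptors, codebook):
    """Normalised histogram of visual-word occurrences for one image."""
    if not descriptors:
        raise ValueError("at least one descriptor is required")
    counts = Counter(nearest_codeword(d, codebook) for d in descriptors)
    total = len(descriptors)
    return [counts[i] / total for i in range(len(codebook))]


# Hypothetical toy data: three codewords and four patch descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
patches = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.8), (0.1, 0.9)]

histogram = bag_of_words(patches, codebook)
print(histogram)  # [0.25, 0.5, 0.25]
```

In practice the descriptors would be high-dimensional (for example grey-level patches or SIFT vectors) and the codebook would come from a clustering step such as k-means; the histogram is what a classifier or a latent topic model then takes as input.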
… shows that the spatial envelope property is estimated by a dot product between the amplitude spectrum of the image and a template, the DST. The DST (discriminant spectral template) is a function that describes how each spectral component contributes to a spatial envelope property. The DST is parametrized by a column vector which is determined during a learning stage. A similar estimation can be performed when using the spectrogram features. The WDST (windowed discriminant spectral template) describes how the spectral components at different spatial locations contribute to a spatial envelope property. The performance of the spatial envelope model shows that specific information about object shape or identity is not a requirement for scene categorisation and that modelling a holistic representation of the scene informs about its probable semantic category. … are the number of functions used for the approximations and determine the dimensionality of each representation, and … are basis functions of the energy spectrum.

4. Evaluation

A robust and objective methodology for the evaluation of existing approaches to scene classification is needed in order to discern the best method for a specific application field. This, albeit necessary, is not a trivial task due to the heterogeneous data and classification implementations, and has often been disregarded in the existing literature on the topic. Proposals differ in the objectives they try to satisfy (e.g. number and kind of scenes to classify) and in the image data they work with (especially constrained in some cases). Furthermore, test details such as how the images were split into training and test sets are often not specified in published works. Hence, unless a given system is implemented and tested on specific image data, it is very difficult to evaluate from the published works how well it would work for that data.

Our aim here is to provide an evaluation of, although not all existing methodologies, the most representative works derived from each of the criteria reviewed so far. We designed and implemented three algorithms representative of the main approaches identified in this work. We then tested them over the same dataset used by Vogel and Schiele […], which allowed us to compare their performance to the existing published results. We compared the results on scene classification obtained by four different methods mentioned above: (i) low-level image representation (LLI), (ii) low-level block representation (LLB), (iii) image segmentation by classifying present objects (IS) and (iv) bag-of-words model using pLSA (BOW).

The Vogel and Schiele […] dataset used includes 700 natural scenes from the Corel Database consisting of six categories: 144 coasts, 103 forests, 179 mountains, 131 open country, 111 river and 34 sky/clouds. The size of the images is 720×480 (landscape format) or 480×720 (portrait format). Every scene category is characterised by a high degree of diversity and presents potential ambiguities, since the label depends strongly on the subjective perception of the viewer. For example, the three scenes in Fig. 7 could also be labelled as forest by some viewers, as there is also a forest in these images. Moreover, we also evaluated the computational cost of the methods.

4.1. Features and methodology

We used 600 randomly selected training images and the rest for testing, as in […]. The features used are a concatenation of an 84-bin HSI histogram (with 36 bins for H, 32 bins for S and 16 bins for I), 24 features of the gray-level co-occurrence matrices (32 gray levels): contrast, energy, entropy, homogeneity, inverse difference moment, and correlation, for the displacements …, and a 72-bin edge direction histogram. The final feature vector is then 180-dimensional (84 + 24 + 72). Moreover, we have evaluated optimum parameter values for each technique. Here, only the best results obtained with these parameter values are shown. The methodologies for each strategy are the following:

LLI: The algorithm computes global features for each training image, so that each image is represented by a 180-dimensional vector. A test image is classified using K-NN (with K = 10).

LLB: The algorithm extracts feature vectors for each block in the training image following the strategy proposed by Szummer and Picard […]. We divided the image into 2×2 and 4×4 blocks. Each block from the test image is classified using K-NN (with K = 10), and the image is then classified by majority voting over these block results.

IS: This is the method implemented in […]. They first classify each image patch (10×10 grid) as coming from a certain object, and the image classification is then carried out using the object distribution. The authors in […] worked with the same dataset and features as those used herein for evaluation. Thus we have used their published performance to compare it to the other approaches in this paper.

BOW: A 5×5 square neighbourhood around a pixel is used to compute the feature vector. The patches are spaced by 3 pixels on a regular grid. In this case we have a large number of feature vectors, and we quantize them using the k-means algorithm to form the visual vocabulary. Then we classify each image using pLSA. We used 700 clusters for the k-means algorithm (vector quantization) and 25 topics when running pLSA.

The classification task is to assign each test image to one of the six categories. The performance is measured using a confusion table, and overall performance rates are measured by the average value of the diagonal entries of the confusion table.

4.2. Classification results

Classification results are shown in Table 1. As can clearly be seen, the worst results were obtained by the low-level approaches. The LLI algorithm achieved 53.25% correct classification, while the LLB algorithm reached a poor 49.12%. A low-level strategy, which considers the scene as an individual object, is normally used to classify only a small number of scene categories (indoor versus outdoor, city versus landscape, etc.). The six categories considered in our experiments are too complex to be distinguished by low-level scene properties. If we look at Fig. 7a, the ground truth of the left image is forest while the ground truth of the other ones is river. However, their colour and texture distributions are very similar (Fig. 7b shows the HSI histograms) and the low-level method LLI fails when it tries to classify river scenes, classifying them as forest scenes. Something similar happens in Fig. 7c, where coast images are confused as mountain.

In fact, the sets of images and categories used by most authors are often constrained. As an example, the categories used by Vailaya […] were chosen specifically to be nicely separable. The same author recognised in […]: we thus restricted classification of landscape images into three classes that could be more unambiguously distinguished, namely sunset, forest, and mountain classes. Sunset scenes can be characterised by saturated colours (red, orange or yellow), forest scenes have predominately green colour distribution due to the presence of dense trees and foliage, and mountain scenes can be characterised by long distance shots of mountains.

Fig. 7. Some typical scenes confused when using global method LLI for classification. (a) river images confused as forest, (c) coast images confused as mountain. (b and d) HSI histograms from the above images.

Table 1
Performance of the compared approaches over the same dataset. LLI, LLB and BOW have been implemented by ourselves, while the IS performance is the score published in […].

LLI (%) | LLB (%) | IS (%) | BOW (%)
53.25   | 49.12   | 74.10  | 76.92

In contrast, when using local semantic concepts, or object segmentation, we can deal with objects (or concepts) in the images, and classify the images in an easy way according to their distribution. Semantic techniques which make use of an intermediate representation achieved the best results: scores from 74% to almost 77% have been obtained (see Table 1). They can deal with the parts of the image that correspond to … and the ones that correspond to the …, which makes it possible to distinguish between … and forest scenes (see Fig. 8). Note that we have not provided a comparison for the semantic properties-based methodologies; however, it has been demonstrated in […] that bag-of-words methods perform better than the representative proposal of Oliva and Torralba […].

Computational cost. Low-level strategies have two clear advantages: their simplicity and their low computational cost. Over the set of 600 training images, 3 min are needed to construct the classifier when using LLI, and 6 min and 40 s when using LLB. This computational cost is much higher when using BOW, because more information from the images is used. A preprocessing step is needed to construct the visual vocabulary, which is rather expensive: around 4 h for extracting 6400 descriptors per image and running k-means with 700 clusters. The step for fitting pLSA takes 10 min. However, the costs to classify a test image are comparable: 2 and 7 s for LLI and LLB respectively and 15 s for BOW. The authors in […] did not give the computational cost of their algorithm. All the above experiments were run on a 1.7 GHz computer with a Matlab implementation.

4.3. Discussion

Although low-level strategies have the lowest computational cost, they perform poorly because they are unable to distinguish between complex scenes. Hierarchical schemes have been proposed to overcome this drawback; however, our results seem to corroborate the inappropriateness of these methods when the number of categories is increased. On the other hand, the best results have been obtained when using local semantic concepts with the bag-of-words and pLSA method (76.92%). Besides, this approach has a very relevant property: local semantic approaches are also the ones which require the least user intervention to learn intermediate representations, since they learn directly from the data in an unsupervised (e.g. [35,33]) or semi-supervised ([…]) way. In contrast, a main requirement of the other semantic modelling approaches is the manual annotation of these properties. In the work of Oliva and Torralba […], human subjects are instructed to rank each of the hundreds of training scenes along 6 different properties. In […], human subjects are asked to classify nearly 60,000 local patches from the training images into nine different semantic concepts. Both cases involve tens of hours of manual labelling. Hence, a drawback of these strategies is their preprocessing cost, although this step could be done off-line.

Focusing on the semantic approaches, the main drawback of segmentation techniques (IS) with respect to the ones which use local semantic concepts (BOW) is probably the accuracy of the segmentation method. If objects are not well segmented, all the subsequent classification stages will probably fail. In contrast, when using local semantic concepts the segmentation process is omitted and the image is classified by looking at the local patches. Furthermore, another important feature that helps to explain the best results obtained by the BOW technique is probably its freedom to choose appropriate concepts (25 topics) for a data set. The system organises them in its own way in order to have different object representations …

Fig. 8. Some scenes confused when using low-level methods and well classified when using local semantic concepts. (a) original image, (b) topics assigned by pLSA to each patch, (c) topic distribution P(z|d) for each image (horizontal axis: z (topics); vertical axis: P(z|d)). Note that the previously confused images (see text) have different distributions according to their kind of scene, which allows a better classification rate to be obtained.

Table 2
Summary of the analysed scene classification systems

Author | Objects (classifier) | Scenes (classifier) | Features | # scenes

Low-level strategies

Global
Vailaya et al. [10,12] | - | Bayesian classifiers | LUV and HSV color space (spatial moments, histograms, coherence vectors); MSAR; edge directions histograms, coherence vectors | 5: indoor, city, sunset, forest and mountain
Chang et al. […] | - | SVM; Bayes point machine | Color, texture | 15: architecture, bears, clouds, elephants, fabric, fireworks, flowers, food, landscape, object images, people, texture, tigers, tools, and waves
Shen et al. […] | - | SVM; K-NN; GMM | Color; texture; shape | 10: natural scenery, architecture, plants, animals, rocks, flags, buses, food, human faces and roses

Sub-blocks
Szummer and Picard […] | - | K-NN; 3-layer NN; mixture of Experts classifier | Ohta color space; MSAR; shift-invariant | 2: indoor and outdoor
Serrano et al. […] | - | SVM | LST color space; wavelet texture | 2: indoor and outdoor

Semantic strategies

Objects
Fan et al. […] | SVM | Bayesian framework | coverage ratio, region center, region rectangular box, Tamura texture, wavelet texture/color LUV | 6: mountain view, beach, garden, sailing, skiing and desert
Luo et al. [21,23] | K-NN | Bayesian Network | Ohta color space, MSAR | 2: indoor, outdoor
Fredembach et al. [25] | Multivariate Gaussian analysis based on the maximum a posteriori rule | Probabilistic method | RGB, Lab, co-occurrence matrix, amplitude spectrum of the Fourier transform | 3: vegetation, sky and skin
Mojsilovic et al. [26] | Naive Bayes classifier | Bayesian framework | region spatial relationships | Portraits, people, outdoor, crowds, city, indoor, landscapes, etc.
Vogel and Schiele [27,28] | K-NN; SVM | M-SVM; SSD between category prototypes | Color (HSV, RGB) and edge histograms; co-occurrence matrix | 6: sky, coast, mountains, field, river and forest

Concepts
Fei-Fei et al. [34] | Implicit with the method | Bag-of-words and LDA extension | Dense SIFT and gray level descriptors on a regular grid | 13: forest, coast, mountain, open country, street, inside city, tall buildings, highway, bedroom, suburb, living room, kitchen and office
Quelhas et al. [35] | Implicit with the method | Bag-of-words and pLSA | Sparse SIFT around interest points | 3: indoor, city and landscape
Bosch et al. [33] | Implicit with the method | Bag-of-words and pLSA | Dense SIFT on a regular grid; concentric patches allow scale variation | Up to 13: the same as in […]
Perronin et al. [41] | Implicit with the method | Bag-of-words and GMM | Dense SIFT on a regular grid | Up to 10: Africa, beach, buildings, buses, dinosaurs, elephants, flowers, horses, mountains and food
Lazebnik et al. [36] | Implicit with the method | Bag-of-words and pyramid kernels | Dense SIFT on a regular grid | 15: the same as in […] plus industrial and …

Properties
Oliva and Torralba [18,44,45] | Implicit with the method | K-NN | Spatial envelope (DST, WDST) | 4 man-made: street, highway, tall buildings and inside city; 4 natural: forest, coast, mountain and open country
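The overall performance rate used in Section 4.1, the average value of the diagonal entries of the confusion table, can be sketched as follows. This is only an illustration under stated assumptions: the confusion table is assumed to be row-normalised to percentages (one row per true category), which the text does not state explicitly, and the three-category counts and function names below are hypothetical, not results from the paper.

```python
def row_normalise(confusion_counts):
    """Convert raw counts (rows = true class) to per-class percentages."""
    table = []
    for row in confusion_counts:
        total = sum(row)
        table.append([100.0 * c / total if total else 0.0 for c in row])
    return table


def overall_rate(confusion_table):
    """Average of the diagonal entries of a square confusion table."""
    n = len(confusion_table)
    return sum(confusion_table[i][i] for i in range(n)) / n


# Hypothetical counts for three categories (rows: true, columns: predicted).
counts = [
    [8, 2, 0],
    [1, 3, 1],
    [0, 5, 5],
]

table = row_normalise(counts)
rate = overall_rate(table)
print(round(rate, 2))  # 63.33
```

With this definition every category contributes equally to the overall rate, regardless of how many test images it contains. For the toy counts above, the plain fraction of correctly classified images would instead be 16/25 = 64%.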
[21] J. Luo, A.E. Savakis, A. Singhal, A Bayesian network-based framework for semantic image understanding, Pattern Recognition 38 (2005) 919-934.
[22] J. Luo, A. Singhal, S.P. Etz, R.T. Gray, A computational approach to determination of main subject regions in photographic images, Image and Vision Computing 22 (2004) 227-241.
[23] J. Luo, A. Savakis, Indoor vs outdoor classification of consumer photographs using low-level and semantic features, in: IEEE International Conference on Image Processing, vol. 2, Thessaloniki, Greece, 2001, pp. 745-748.
[24] S. Aksoy, K. Koperski, C. Tusk, G. Marchisio, J.C. Tilton, Learning Bayesian classifiers for scene classification with a visual grammar, IEEE Transactions on Geoscience and Remote Sensing 43 (3) (2005).
[25] C. Fredembach, M. Schroder, S. Susstrunk, Eigenregions for image classification, IEEE Transactions on Pattern Analysis and Machine Intelligence 26 (12) (2004) 1645-1649.
[26] A. Mojsilovic, J. Gomes, B. Rogowitz, Isee: Perceptual features for image library navigation, in: Proc. SPIE Human Vision and Electronic Imaging, vol. 4662, San Jose, California, 2002, pp. 266-277.
[27] J. Vogel, Semantic Scene Modeling and Retrieval, no. 33 in Selected Readings in Vision and Graphics, Houghton Hartung-Gorre Verlag Konstanz, 2004.
[28] J. Vogel, B. Schiele, Natural scene retrieval based on a semantic modeling step, in: International Conference on Image and Video Retrieval, LNCS, vol. 3115, Dublin, Ireland, 2004, pp. 207-215.
[29] M. Varma, A. Zisserman, Texture classification: Are filter banks necessary?, in: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, Madison, Wisconsin, 2003, pp. 691-698.
[30] T. Leung, J. Malik, Representing and recognizing the visual appearance of materials using three-dimensional textons, International Journal of Computer Vision 43 (1) (2001) 29-44.
[31] J. Portilla, E. Simoncelli, A parametric texture model based on joint statistics of complex wavelet coefficients, International Journal of Computer Vision 40 (1) (2000) 49-70.
[32] S. Lazebnik, C. Schmid, J. Ponce, A sparse texture representation using affine-invariant regions, in: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, Madison, Wisconsin, 2003, pp. 319-324.
[33] A. Bosch, A. Zisserman, X. Munoz, Scene classification via pLSA, in: European Conference on Computer Vision, vol. 4, Graz, Austria, 2006, pp. 517-530.
[34] L. Fei-Fei, P. Perona, A Bayesian hierarchical model for learning natural scene categories, in: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Washington, DC, USA, 2005, pp. 524-531.
[35] P. Quelhas, F. Monay, J. Odobez, D. Gatica-Perez, T. Tuytelaars, L. Van Gool, Modeling scenes with local descriptors and latent aspects, in: International Conference on Computer Vision, Beijing, China, 2005, pp. 883-890.
[36] S. Lazebnik, C. Schmid, J. Ponce, Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories, in: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, New York, 2006, to appear.
[37] T. Hofmann, Unsupervised learning by probabilistic latent semantic analysis, Machine Learning 41 (2) (2001) 177-196.
[38] D. Blei, A. Ng, M. Jordan, Latent Dirichlet allocation, Journal of Machine Learning Research 3 (2003) 993-1022.
[39] Y. Teh, M. Jordan, M. Beal, D. Blei, Hierarchical Dirichlet process, Neural Information Processing Systems 17 (2005) 1385-1392.
[40] B.C. Russell, A.A. Efros, J. Sivic, W.T. Freeman, A. Zisserman, Using multiple segmentations to discover objects and their extent in image collections, in: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, New York, 2006, to appear.
[41] F. Perronin, C. Dance, G. Csurka, M. Bressan, Adapted vocabularies for generic visual categorization, in: European Conference on Computer Vision, vol. 4, Graz, Austria, 2006, pp. 464-475.
[42] A. Bosch, X. Munoz, J. Marti, Using appearance and context for outdoor scene object classification, in: IEEE International Conference on Image Processing, vol. II, Genova, Italy, 2005, pp. 1218-1221.
[43] R. Fergus, L. Fei-Fei, P. Perona, A. Zisserman, Learning object categories from Google's image search, in: International Conference on Computer Vision, vol. II, Beijing, China, 2005, pp. 1816-1823.
[44] A. Torralba, A. Oliva, Semantic organization of scenes using discriminant structural templates, in: International Conference on Computer Vision, Korfu, Greece, 1999, pp. 1253-1258.
[45] A. Oliva, A. Torralba, Scene-centered description from spatial envelope properties, in: International Workshop on Biologically Motivated Computer Vision, LNCS, vol. 2525, Tuebingen, Germany, 2002, pp. 263-272.
[46] A. Bosch, X. Munoz, A. Oliver, R. Martí, Object and scene classification: what does a supervised approach provide us?, in: IAPR International Conference on Pattern Recognition, Hong Kong, 2006, to appear.
[47] M.D. Squire, W. Muller, H. Muller, T. Pun, Content-based query of image databases: inspirations from text retrieval, Pattern Recognition Letters 21 (2000) 1193-1198.
[48] J. Sivic, B. Russell, A. Efros, A. Zisserman, W.T. Freeman, Discovering objects and their locations in images, in: International Conference on Computer Vision, Beijing, China, 2005, pp. 370-377.