Figure 1: Left: Our deformable 3D cuboid model. Right: Viewpoint angle.

…pact visual representation. Unfortunately, these approaches work with weaker appearance models that cannot compete with current discriminative approaches [1, 6, 13]. Recently, Hedau et al. [2] proposed to extend the 2D HOG-based template detector of [14] to predict 3D cuboids. However, since the model represents the object's appearance as a rigid template in 3D, its performance has been shown to be inferior to (2D) deformable part-based models (DPMs) [1]. In contrast, in this paper we extend DPM to reason in 3D. Our model represents an object class with a deformable 3D cuboid composed of faces and parts, which are both allowed to deform with respect to their anchors on the 3D box (see Fig. 1). Towards this goal, we introduce the notion of a stitching point, which enables the deformation between the faces and the cuboid to be encoded efficiently. We model the appearance of each face in fronto-parallel coordinates, thus effectively factoring out the appearance variation due to viewpoint. We reason about different face visibility patterns, called aspects [15]. We train the cuboid model jointly and discriminatively, and share weights across all aspects to attain efficiency. In inference, our model outputs 2D along with oriented 3D bounding boxes around the objects. This enables the estimation of the object's viewpoint, which is a continuous variable in our representation. We demonstrate the effectiveness of our approach in indoor [2] and outdoor scenarios [16], and show that our approach significantly outperforms the state of the art in both 2D [1] and 3D object detection [2].

2 Related work

The most common way to tackle 3D detection is to represent a 3D object by a collection of independent 2D appearance models [4, 5, 1, 6, 13], one for each viewpoint. Several authors augmented the multi-view representation with weak 3D information by linking the features or parts across views [17, 18, 19, 20, 21]. This allows for a dense representation of the viewing sphere by morphing related near-by views [12]. Since these methods usually require a significant amount of training data, renderings of synthetic CAD models have been used to supplement under-represented views or provide supervision for training object parts or object geometry [22, 13, 8]. Object-centered approaches represent object classes with a 3D model typically equipped with view-invariant geometry and appearance [7, 23, 24, 8, 9, 10, 11, 25]. While these types of models are attractive as they enable continuous viewpoint representations, their detection performance has typically been inferior to 2D deformable models. Deformable part-based models (DPMs) [1] are nowadays arguably the most successful approach to category-level 2D detection. Towards 3D, DPMs have been extended to reason about object viewpoint by training the mixture model with viewpoint supervision [6, 13]. Pepik et al. [13] took a step further by incorporating supervision also at the part level. Consistency was enforced by forcing the parts for different 2D viewpoint models to belong to the same set of 3D parts in the physical space. However, all these approaches base their representation in 2D and thus output only 2D bounding boxes along with a discretized viewpoint.

The closest work to ours is [2], which models an object with a rigid 3D cuboid, composed of independently trained faces without deformations or parts. Our model shares certain similarities with this work, but has a set of important differences. First, our model is hierarchical and deformable: we allow deformations of the faces, while the faces themselves are composed of deformable parts. We also explicitly reason about the visibility patterns of the cuboid model and train the model accordingly. Furthermore, all the parameters in our model are trained jointly using a latent SVM formulation. These differences are important, as our approach outperforms [2] by a significant margin.

                   Detectors' performance            Layout rescoring
                   DPM [1]   3D det.   combined      DPM [1]   3D det.   combined
Hedau et al. [2]   54.2%     51.3%     59.6%         --        --        62.8%
ours               55.6%     59.4%     60.5%         60.0%     64.6%     63.8%

Table 1: Detection performance (measured in AP at 0.5 IOU overlap) for the bed dataset of [2]

3D measure     DPM     3D BBOX   3D combined   BBOX 3D + layout   comb. + layout
convex hull    48.2%   53.9%     53.9%         57.8%              57.1%
face overlap   16.3%   33.0%     34.4%         33.5%              33.6%

Table 2: 3D detection performance in AP (50% IOU overlap of convex hulls and faces)
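The numbers in Tables 1 and 2 are average precision at a 0.5 intersection-over-union threshold. As a minimal sketch of that metric (the function names and interface are ours, not from the paper; detections are assumed to have been matched to ground truth beforehand):

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def average_precision(scores, is_true_positive, n_ground_truth):
    """AP as the area under the precision-recall curve.

    A detection counts as a true positive when it overlaps a ground-truth
    box by at least 0.5 IOU; that matching is assumed done already."""
    order = np.argsort(-np.asarray(scores))
    hits = np.asarray(is_true_positive, dtype=float)[order]
    tp = np.cumsum(hits)
    fp = np.cumsum(1.0 - hits)
    recall = tp / n_ground_truth
    precision = tp / (tp + fp)
    # Sum precision over the recall increments (area under the PR curve).
    return float(np.sum(np.diff(np.concatenate([[0.0], recall])) * precision))
```

Ranking detections by score before accumulating true/false positives is what makes the PR curve (Fig. 5) and its area well defined.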
Figure 5: Precision-recall curves for (left) 2D detection, (middle) convex hull, (right) face overlap.

Learning: Given a set of training samples D = {⟨x1, y1, bb1⟩, …, ⟨xN, yN, bbN⟩}, where x is an image, yi ∈ {−1, 1}, and bb ∈ R^(8×2) are the eight coordinates of the 3D bounding box in the image, our goal is to learn the weights w = [w_a1, …, w_aP] for all P aspects in Eq. (5). To train our model using partially labeled data, we use a latent SVM formulation [1]; however, frameworks such as latent structural SVMs [27] are also possible. To initialize the full model, we first learn a deformable face+parts model for each face independently, where the faces of the training examples are rectified to be frontal prior to training. We estimate the different aspects of our 3D model from the statistics of the training data, and compute for each training cuboid the relative positions v_a,i of face i and the stitching point in the rectified view of each face. We then perform joint training of the full model, treating the training cuboid and the stitching point as latent, while requiring that each face filter and the face annotation overlap more than 70%. Following [1], we utilize a stochastic gradient descent approach which alternates between solving for the latent variables and updating the weights w. Note that this algorithm is only guaranteed to converge to a local optimum, as the latent variables make the problem non-convex.

4 Experiments

We evaluate our approach on two datasets, the dataset of [2] as well as KITTI [16], an autonomous driving dataset. To our knowledge, these are the only datasets which have been labeled with 3D bounding boxes. We begin our experimentation with the indoor scenario [2]. The bedroom dataset contains 181 train and 128 test images. To enable a comparison with the DPM detector [1], we trained a model with 6 mixtures and 8 parts using the same training instances but employing 2D bounding boxes. Our 3D bed model was trained with two parts per face. Fig. 3 shows the statistics of the dataset in terms of the number of training examples for each aspect (where L-R-T denotes an aspect for which the front, right and top faces are visible), as well as per face. Note that the fact that the dataset is unbalanced (fewer examples for aspects with two faces) does not affect our approach too much, as only the face-stitching point deformation parameters are aspect specific. As we share the weights among the aspects, the number of training instances for each face is significantly higher (Fig. 3, middle). We compare this to DPM in Fig. 3, right. Our method can better exploit the training data by factoring out the viewpoint dependence of the training examples.

We begin our quantitative evaluation by using our model to reason about 2D detection. The 2D bounding boxes for our model are computed by fitting a 2D box around the convex hull of the projection of the predicted 3D box. We report average precision (AP), where we require that the output 2D boxes overlap with the ground-truth boxes at least 50% using the intersection-over-union (IOU) criterion. The precision-recall curves are shown in Fig. 5. We compare our approach to the deformable part model (DPM) [1] and the cuboid model of Hedau et al. [2]. As shown in Table 1, we outperform the cuboid model of [2] by 8.1% and DPM by 3.8%. This is notable, as to the best …

Figure 8: KITTI: examples of car detections. (top) Ground truth, (bottom) our 3D detections, augmented with best fitting CAD models to visualize inferred 3D box orientations.

Since our model also predicts the locations of the dominant object faces (and thus the 3D object orientation), we would like to quantify its accuracy. We introduce an even stricter measure where we require that also the predicted cuboid faces overlap with the faces of the ground-truth cuboids. In particular, a hypothesis is correct if the average of the overlaps between the top faces and the vertical faces exceeds 50% IOU. We compare the results of our approach to DPM [1]. Note, however, that [1] returns only 2D boxes and hence a direct comparison is not possible. We thus augment the original DPM with 3D information in the following way. Since the three dominant orientations of the room, and thus the objects, are known (estimated via the vanishing points), we can find a 3D box whose projection best overlaps with the output of the 2D detector. This can be done by sliding a cuboid (whose dimensions match our cuboid model) in 3D to best fit the 2D bounding box. Our approach outperforms the 3D augmented DPM by a significant margin of 16.7%. We attribute this to the fact that our cuboid is deformable and thus the faces localize more accurately on the faces of the object.

We also conducted preliminary tests for our model on the autonomous driving dataset KITTI [16]. We trained our model with 8 aspects (estimated from the data) and 4 parts per face. An example of a learned aspect model is shown in Fig. 4. Note that the rectangular patches on the faces represent the parts, and color coding is used to depict the learned part and face deformation weights. We can observe that the model effectively and compactly factors out the appearance changes due to changes in viewpoint. Examples of detections are shown in Fig. 8. The top rows show ground truth annotations, while the bottom rows depict our predicted 3D boxes. To showcase also the viewpoint prediction of our detector, we insert a CAD model inside each estimated 3D box, matching its orientation in 3D. In particular, for each detection we automatically chose a CAD model, out of a collection of 80 models, whose 3D bounding box best matches the dimensions of the predicted box. One can see that our 3D detector is able to predict the viewpoints of the objects well, as well as the type of car.

5 Conclusion

We proposed a novel approach to 3D object detection, which extends the well-acclaimed DPM to reason in 3D by means of a deformable 3D cuboid. Our cuboid allows for deformations at the face level via a stitching point, as well as deformations between the faces and the parts. We demonstrated the effectiveness of our approach in indoor and outdoor scenarios and showed that our approach outperforms [1] and [2] in terms of 2D and 3D estimation. In future work, we plan to reason jointly about the 3D scene layout and the objects in order to improve the performance in both tasks.

Acknowledgements. S.F. has been supported in part by DARPA, contract number W911NF-10-2-0060. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either express or implied, of the Army Research Laboratory or the U.S. Government.
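The 2D-box extraction step described in Section 4 (fitting a 2D box around the convex hull of the projected 3D cuboid) can be sketched as follows; since the fitted box is axis-aligned, the bounding box of the convex hull reduces to coordinate-wise minima and maxima over the eight projected corners. The function name is ours, not from the paper:

```python
import numpy as np

def box2d_from_cuboid(projected_corners):
    """Fit an axis-aligned 2D box around a cuboid's image projection.

    `projected_corners` is an (8, 2) array of the cuboid's corners already
    projected into the image plane. The tight axis-aligned box around the
    convex hull of a point set is simply the min/max of its coordinates.
    Returns (x1, y1, x2, y2)."""
    pts = np.asarray(projected_corners, dtype=float)
    x1, y1 = pts.min(axis=0)
    x2, y2 = pts.max(axis=0)
    return (x1, y1, x2, y2)
```

This is the box that is then scored against ground truth at the 50% IOU threshold used throughout the experiments.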