Here, Pn is the 2D projection of the 3D shape Sn with white noise Nn and the rigid trans


yoshiko-marsland
Uploaded On 2016-07-20

Presentation Transcript

Here, $P_n$ is the 2D projection of the 3D shape $S_n$ with white noise $N_n$ and the rigid transformation given by the orthographic projection matrix $R_n$, scale $c_n$ and 2D translation $T_n$. The shape is parameterized as a factored Gaussian with a mean shape $\bar{S}$, $m$ basis vectors $[V_1, V_2, \ldots, V_m] = V$ and latent deformation parameters $z_n$. Our key modification is constraint (2), where $C^{n}_{mask}$ denotes the Chamfer distance field of the $n$-th instance's binary mask and says that all keypoints $p_{k,n}$ of instance $n$ should lie inside its binary mask. We observed that this results in more accurate viewpoints as well as more meaningful shape bases learnt from the data.

Learning. The likelihood of the above model is maximized using the EM algorithm. Missing data (occluded keypoints) is dealt with by “filling in” the values using the forward equations after the E-step. The algorithm computes shape parameters $\{\bar{S}, V\}$, rigid body transformations $\{c_n, R_n, T_n\}$ as well as the deformation parameters $\{z_n\}$ for each training instance $n$. In practice, we augment the data using horizontally mirrored images to exploit bilateral symmetry in the object classes considered. We also precompute the Chamfer distance fields for the whole set to speed up computation. As shown in Figure 3, NRSfM allows us to reliably predict viewpoint while being robust to intraclass variations.
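The projection model and the mask constraint described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, array layouts and the brute-force exact Euclidean distance field (standing in for a Chamfer distance field) are assumptions made for illustration, the white noise term $N_n$ is omitted, and keypoints are assumed to project inside the image bounds.

```python
import numpy as np


def project_shape(mean_shape, bases, z, R, c, T):
    """Deform the mean shape and project it orthographically to 2D.

    mean_shape: (K, 3) mean 3D keypoints
    bases:      (m, K, 3) deformation basis shapes
    z:          (m,) latent deformation parameters
    R:          (2, 3) first two rows of a rotation matrix
    c:          scalar scale
    T:          (2,) 2D translation
    Returns the (K, 2) noise-free projection.
    """
    # S_n = mean shape + sum_j z_j * V_j
    shape_3d = mean_shape + np.tensordot(z, bases, axes=1)
    return c * shape_3d @ R.T + T


def distance_field(mask):
    """Distance from every pixel to the nearest foreground pixel (0 inside).

    Brute force, O(pixels * foreground pixels): only for tiny example masks.
    """
    fg = np.argwhere(mask)                          # (F, 2) as (row, col)
    grid = np.indices(mask.shape).reshape(2, -1).T  # (H*W, 2) as (row, col)
    diff = grid[:, None, :] - fg[None, :, :]
    return np.sqrt((diff ** 2).sum(-1)).min(axis=1).reshape(mask.shape)


def keypoints_inside(points_2d, field):
    """True where a projected (x, y) keypoint lands on a zero-distance pixel."""
    cols = np.rint(points_2d[:, 0]).astype(int)
    rows = np.rint(points_2d[:, 1]).astype(int)
    return field[rows, cols] == 0
```

In this sketch the constraint holds for an instance when `keypoints_inside` returns `True` for every visible keypoint. Computing the field once per mask and indexing into it afterwards mirrors the precomputation mentioned in the text.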
Figure 3: NRSfM viewpoint estimation: estimated viewpoints visualized using a 3D car wireframe.

2.2. 3D Basis Shape Model Learning

Equipped with camera projection parameters and keypoint correspondences (lifted to 3D by NRSfM) on the whole training set, we proceed to build deformable 3D shape models from object silhouettes within a class. 3D shape reconstruction from multiple silhouettes projected from a single object in calibrated settings has been widely studied. Two prominent approaches are visual hulls [24] and variational methods derived from snakes, e.g. [14, 30], which deform a surface mesh iteratively until convergence. Some interesting recent papers have extended variational approaches to handle categories [12, 13] but typically require some form of 3D annotations to bootstrap models. A recently proposed visual-hull based approach [36] requires only 2D annotations, as we do, for class-based reconstruction, and it was successfully demonstrated on PASCAL VOC. It does not serve our purposes, however, as it makes strong assumptions about the accuracy of the segmentation and will in fact fill entirely any segmentation with a voxel layer.

Shape Model Formulation. We model our category shapes as deformable point clouds, one for each subcategory of the class. The underlying intuition is the following: some types of shape variation may be well explained by a parametric model, e.g. a Toyota sedan and a Lexus sedan, but it is unreasonable to expect them to model the variations between sailboats and cruise liners. Such models typically require knowledge of object parts, their spatial arrangements etc. [22] and involve complicated formulations that are difficult to optimize. We instead train separate linear shape models for different subcategories of a class. As in the NRSfM model, we use a linear combination of bases to model these deformations. Note that we learn such models from silhouettes, and this is what enables us to learn deformable models without relying on point correspondences between scanned 3D exemplars [8]. Our shape model $M = (\bar{S}, V)$ comprises a mean shape $\bar{S}$ and deformation bases $V = \{V_1, \ldots, V_K\}$ learnt from a training set $T : \{(O_i, P_i)\}_{i=1}^{N}$, where $O_i$ is the instance silhouette and $P_i$ is the projection function from world to image coordinates. Note that the $P_i$ we obtain using NRSfM corresponds to orthographic projection, but our algorithm could handle perspective projection as well.

Energy Formulation. We formulate our objective function primarily based on image silhouettes. For example, the shape for an instance should always project within its silhouette and should agree with the keypoints (lifted to 3D by NRSfM). We capture these by defining corresponding energy terms as follows. (Here $P(S)$ corresponds to the 2D projection of shape $S$, $C_{mask}$ refers to the Chamfer distance field of the binary mask of silhouette $O$, and $\Delta_k(p; Q)$ is defined as the squared average distance of point $p$ to its $k$ nearest neighbors in set $Q$.)

Silhouette Consistency. Silhouette consistency simply enforces the predicted shape for an instance to project inside its silhouette. This can be achieved by penalizing the points projected outside the instance mask by their distance from the silhouette. In our notation it can be written as follows:

$$E_s(S, O, P) = \sum_{C_{mask}(p) > 0} \Delta_1(p; O) \qquad (3)$$

Silhouette Coverage. Using silhouette consistency alone […]

Figure 4: Mean shapes learnt for rigid classes in PASCAL VOC obtained using our basis shape formulation. Color encodes depth when viewed frontally.

[…] accordingly. Finally, the mean shape is rotated as per the predicted viewpoint and translated to the center of the predicted bounding box.

Shape Inference. After initialization, we solve for the deformation weights
(initialized to 0) as well as all the camera projection parameters (scale, translation and rotation) by optimizing equation (9) for fixed $\bar{S}$, $V$. Note that since we do not have access to annotated keypoint locations at test time, the 'Keypoint Consistency' energy $E_{kp}$ is ignored during the optimization.

Bottom-up Shape Refinement. The above optimization results in a top-down 3D reconstruction based on the category-level models, inferred object silhouette, viewpoint and our shape priors. We propose an additional processing step to recover high frequency shape information by adapting the intrinsic images algorithm of Barron and Malik [5, 4], SIRFS, which exploits statistical regularities between shapes, reflectance and illumination. Formally, SIRFS is formulated as the following optimization problem:

$$\underset{Z, L}{\text{minimize}} \;\; g(I - S(Z, L)) + f(Z) + h(L)$$

where $R = I - S(Z, L)$ is a log-reflectance image, $Z$ is a depth map and $L$ is a spherical-harmonic model of illumination. $S(Z, L)$ is a rendering engine which produces a log shading image with the illumination $L$. $g$, $f$ and $h$ are the loss functions corresponding to reflectance, shape and illumination respectively. We incorporate our current coarse estimate of shape into SIRFS through an additional loss term:

$$f_o(Z, Z') = \sum_i \left( (Z_i - Z'_i)^2 + \epsilon^2 \right)^{\gamma_o}$$

where $Z'$ is the initial coarse shape and $\epsilon$ a parameter added to make the loss differentiable everywhere. We obtain $Z'$ for an object by rendering a depth map of our fitted 3D shape model, which guides the optimization of this highly non-convex cost function. The outputs from this bottom-up refinement are reflectance, shape and illumination maps, of which we retain the shape.

Implementation Details. The gradients involved in our optimization for shape and projection parameters are extremely efficient to compute. We use approximate nearest neighbors computed using a k-d tree to implement the 'Silhouette Coverage' gradients and leverage Chamfer distance fields for obtaining 'Silhouette Consistency' gradients. Our overall computation takes only about 2 sec to reconstruct a novel instance using a single CPU core. Our training pipeline is equally efficient, taking only a few minutes to learn a shape model for a given object category.

4. Experiments

Experiments were performed to assess two things: 1) how expressive our learned 3D models are, by evaluating how well they matched the underlying 3D shapes of the training data, and 2) how sensitive they are when fit to images using noisy automatic segmentations and pose predictions.

Datasets. For all our experiments, we consider images from the challenging PASCAL VOC 2012 dataset [15], which contains objects from the 10 rigid object categories (as listed in Table 1). We use the publicly available ground truth class-specific keypoints [9] and object segmentations [19]. Since ground truth 3D shapes are unavailable for PASCAL VOC and most other detection datasets, we evaluated the expressiveness of our learned 3D models on the next best thing we managed to obtain: the PASCAL3D+ dataset [39], which has up to 10 3D CAD models for the rigid categories in PASCAL VOC. PASCAL3D+ provides between 4 different models for “tvmonitor” and “train” and 10 for “car” and “chair”. The different meshes primarily distinguish between subcategories but may also be redundant (e.g., there are more than 3 meshes for sedans in “car”). We obtain our subcategory labels on the training data by merging some of these cases, which also helps us in tackling data sparsity for some subcategories. The subset of PASCAL we considered after filtering occluded instances, which we do not tackle in this paper, had between 70 images for “sofa” and 500 images for classes “aeroplanes” and “cars”. We will make all our image sets available along with our implementation.

Metrics. We quantify the quality of our 3D models by comparing against the PASCAL3D+ models using two metrics: 1) the Hausdorff distance normalized by the 3D bounding box size of the ground truth model [3] and 2) a depth map error to evaluate the quality of the reconstructed visible object surface, measured as the mean absolute distance between reconstructed and ground truth depth:

$$\text{Z-MAE}(\hat{Z}, Z) = \frac{1}{n \cdot \gamma} \min_{\beta} \sum_{x,y} \left| \hat{Z}_{x,y} - Z_{x,y} - \beta \right| \qquad (10)$$

where $\hat{Z}$ and $Z$ represent predicted and ground truth depth maps respectively. Analytically, $\beta$ can be computed as the median of $\hat{Z} - Z$, and $\gamma$ is a normalization factor to account for absolute object size, for which we use the bounding box diagonal. Note that our depth map error is translation and scale invariant.

                  aero   bike   boat   bus    car    chair  mbike  sofa   train  tv     mean
Mesh
  KP+Mask         5.00   6.27   9.94   6.22   5.18   5.20   4.98   6.58   12.60  9.64   7.16
  Carvi [36]      5.07   6.03   8.80   8.76   4.38   5.74   4.86   6.49   17.52  8.37   7.60
  Puffball [35]   9.73   10.39  11.68  15.40  11.77  8.58   8.99   8.62   23.68  9.45   11.83
Depth
  KP+Mask         9.25   7.87   12.36  11.77  7.22   7.51   8.97   9.70   30.91  6.84   11.24
  Carvi [36]      9.39   7.24   11.43  18.42  6.86   7.39   8.06   12.21  29.57  5.75   11.63
  SIRFS [4]       12.98  12.31  16.03  29.21  21.58  15.53  16.30  18.08  38.54  21.36  20.19

Table 1: Studying the expressiveness of our learnt 3D models: comparison between our method and [36, 35] using ground truth keypoints and masks on PASCAL VOC. Note that [36] operates with ground truth annotations and reconstructs an image corpus, and our method is used here on the same task for a fair comparison. Please see text for more details.

                        aero   bike   boat   bus    car    chair  mbike  sofa   train  tv     mean
Mesh
  KP+Mask               5.13   6.46   10.46  5.89   5.07   5.34   5.15   15.07  12.16  11.69  8.24
  KP+SDS                4.96   6.58   10.58  4.67   4.97   5.40   5.21   15.08  12.78  12.18  8.24
  PP+SDS                6.58   14.02  14.43  6.65   7.96   7.47   7.57   15.21  15.23  13.24  10.84
  Puffball [35] (SDS)   9.68   10.23  11.80  15.95  12.42  8.28   9.45   9.60   23.38  9.26   12.00
Depth
  KP+Mask               9.02   7.26   13.51  12.10  8.04   8.02   10.00  23.05  25.57  7.48   12.41
  KP+SDS                9.07   7.98   13.57  9.90   7.98   7.96   9.99   22.57  23.59  7.64   12.03
  PP+SDS                10.94  11.64  12.26  15.95  13.17  10.06  12.55  21.19  36.37  8.98   15.31
  SIRFS [4]             11.80  11.83  15.98  29.15  21.64  15.58  16.91  19.64  37.58  23.01  20.31

Table 2: Ablation study for our method assuming/relaxing various annotations at test time on objects in PASCAL VOC. As can be seen, our method degrades gracefully with relaxed annotations. Note that these experiments are in a train/test setting and numbers will differ from Table 1. Please see text for more details.

4.1. Expressiveness of Learned 3D Models

We learn and fit our 3D models on the same whole dataset (no train/test split), following the setup of Vicente et al. [36]. Table 1 compares our reconstructions on PASCAL VOC with those of this recently proposed method, which is specialized for this task (e.g. it is not designed for fitting to noisy data), as well as to a state-of-the-art class-agnostic shape inflation method that also reconstructs from a single silhouette. We demonstrate competitive performance on both benchmarks, with our models showing greater robustness to perspective foreshortening effects on “trains” and “buses”. The category-agnostic methods, Puffball [35] and SIRFS [4], consistently perform worse on the benchmark by themselves. Certain classes like “boat” and “tvmonitor” are especially hard because of large intraclass variance and data sparsity respectively.

4.2. Sensitivity Analysis

In order to analyze the sensitivity of our models to noisy inputs, we reconstructed held-out test instances using our models given just ground truth bounding boxes. We compare various versions of our method using ground truth (Mask) or imperfect segmentations (SDS), and keypoints (KP) or our pose predictor (PP) for viewpoint estimation. For pose prediction, we use the CNN-based system of [34] and augment it to predict subtypes at test time. This is achieved by training the system as described in [34] with additional subcategory labels obtained from PASCAL3D+ as described above. To obtain an approximate segmentation from the bounding box, we use the refinement stage of the state-of-the-art joint detection and segmentation system proposed in [20]. Here, we use a train/test setting where our models are trained on only a subset of the data and used to reconstruct the held-out data from bounding boxes. Table 2 shows that our results degrade gracefully from the fully annotated to the fully automatic setting. Our method is robust to some […]
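The two evaluation metrics from the Metrics paragraph can be sketched in NumPy as follows. This is an illustrative reading rather than the evaluation code behind the tables: the function names, the optional `valid` pixel mask, the use of sampled point sets in place of meshes, and the choice of the bounding box diagonal as the "3D bounding box size" for the Hausdorff normalization are all assumptions.

```python
import numpy as np


def z_mae(z_pred, z_gt, bbox_diagonal, valid=None):
    """Depth map error of equation (10).

    z_pred, z_gt:  depth maps of identical shape
    bbox_diagonal: normalization factor gamma (bounding box diagonal)
    valid:         optional boolean mask of pixels to compare (assumption)
    """
    diff = np.asarray(z_pred, dtype=float) - np.asarray(z_gt, dtype=float)
    if valid is not None:
        diff = diff[np.asarray(valid, dtype=bool)]
    diff = diff.ravel()
    # The offset beta minimizing the sum of absolute deviations is the median.
    beta = np.median(diff)
    return np.abs(diff - beta).sum() / (diff.size * bbox_diagonal)


def normalized_hausdorff(points_pred, points_gt):
    """Symmetric Hausdorff distance between two (N, 3) point sets,
    divided by the diagonal of the ground truth axis-aligned bounding box."""
    a = np.asarray(points_pred, dtype=float)
    b = np.asarray(points_gt, dtype=float)
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # pairwise
    hausdorff = max(d.min(axis=1).max(), d.min(axis=0).max())
    diagonal = np.linalg.norm(b.max(axis=0) - b.min(axis=0))
    return hausdorff / diagonal
```

Adding a constant to every predicted depth leaves `z_mae` unchanged because the median offset absorbs it, which is the translation invariance noted in the text. Dividing by the bounding box diagonal is what makes the score comparable across objects of different size.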
Figure 5: Fully automatic reconstructions on detected instances (0.5 IoU with ground truth) using our models on rigid categories in PASCAL VOC. We show our instance segmentation input, the inferred shape overlaid on the image, a 2.5D depth map (after the bottom-up refinement stage), the mesh in the image viewpoint and two other views. It can be seen that our method produces plausible reconstructions, which is a remarkable achievement given just a single image and noisy instance segmentations. Color encodes depth in the image coordinate frame (blue is closer). More results can be found at http://goo.gl/lmALxQ.

[14] C. H. Esteban and F. Schmitt. Silhouette and stereo fusion for 3d object modeling. Comput. Vis. Image Underst., 96(3):367–392, Dec. 2004.
[15] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL Visual Object Classes Challenge 2012 (VOC2012) Results. http://www.pascal-network.org/challenges/VOC/voc2012/workshop/index.html.
[16] R. Garg, A. Roussos, and L. Agapito. Dense variational reconstruction of non-rigid surfaces from monocular video. In CVPR, June 2013.
[17] R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In CVPR, 2014.
[18] A. Gupta, A. A. Efros, and M. Hebert. Blocks world revisited: Image understanding using qualitative geometry and mechanics. In Computer Vision – ECCV 2010, pages 482–496. Springer, 2010.
[19] B. Hariharan, P. Arbelaez, L. Bourdev, S. Maji, and J. Malik. Semantic contours from inverse detectors. In ICCV, 2011.
[20] B. Hariharan, P. Arbeláez, R. Girshick, and J. Malik. Simultaneous detection and segmentation. In European Conference on Computer Vision (ECCV), 2014.
[21] M. Hejrati and D. Ramanan. Analyzing 3d objects in cluttered images. In NIPS, pages 602–610, 2012.
[22] E. Kalogerakis, S. Chaudhuri, D. Koller, and V. Koltun. A Probabilistic Model of Component-Based Shape Synthesis. ACM Transactions on Graphics, 31(4), 2012.
[23] I. Kemelmacher-Shlizerman. Internet based morphable model. In International Conference on Computer Vision (ICCV), 2011.
[24] A. Laurentini. The visual hull concept for silhouette-based image understanding. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 16(2):150–162, Feb 1994.
[25] J. J. Lim, H. Pirsiavash, and A. Torralba. Parsing ikea objects: Fine pose estimation. In ICCV, 2013.
[26] C. Nandakumar, A. Torralba, and J. Malik. How little do we need for 3-d shape perception? Perception-London, 40(3):257, 2011.
[27] R. Nevatia and T. O. Binford. Description and recognition of curved objects. Artificial Intelligence, 8(1):77–98, 1977.
[28] M. Prasad, A. Fitzgibbon, A. Zisserman, and L. Van Gool. Finding nemo: Deformable object class modelling using curve matching. In CVPR, 2010.
[29] L. G. Roberts. Machine Perception of Three-Dimensional Solids. PhD thesis, Massachusetts Institute of Technology, 1963.
[30] Y. Sahillioğlu and Y. Yemez. A surface deformation framework for 3d shape recovery. In Multimedia Content Representation, Classification and Security, volume 4105 of Lecture Notes in Computer Science, pages 570–577. Springer Berlin Heidelberg, 2006.
[31] S. Satkin, M. Rashid, J. Lin, and M. Hebert. 3dnn: 3d nearest neighbor. International Journal of Computer Vision, pages 1–29, 2014.
[32] S. Suwajanakorn, I. Kemelmacher-Shlizerman, and S. Seitz. Total moving face reconstruction. In D. Fleet, T. Pajdla, B. Schiele, and T. Tuytelaars, editors, Computer Vision – ECCV 2014, volume 8692 of Lecture Notes in Computer Science, pages 796–812. Springer International Publishing, 2014.
[33] L. Torresani, A. Hertzmann, and C. Bregler. Non-rigid structure-from-motion: Estimating shape and motion with hierarchical priors. TPAMI, 2008.
[34] S. Tulsiani and J. Malik. Viewpoints and keypoints. In CVPR, 2015.
[35] N. R. Twarog, M. F. Tappen, and E. H. Adelson. Playing with puffball: simple scale-invariant inflation for use in vision and graphics. In ACM Symp. on Applied Perception, 2012.
[36] S. Vicente, J. Carreira, L. Agapito, and J. Batista. Reconstructing pascal voc. In CVPR, 2014.
[37] S. Vicente and L. de Agapito. Balloon shapes: Reconstructing and deforming objects with volume from images. In 3DV, pages 223–230. IEEE, 2013.
[38] Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang, and J. Xiao. 3d shapenets: A deep representation for volumetric shape modeling. In CVPR, 2015.
[39] Y. Xiang, R. Mottaghi, and S. Savarese. Beyond pascal: A benchmark for 3d object detection in the wild. In WACV, 2014.
[40] J. Xiao, B. Russell, and A. Torralba. Localizing 3d cuboids in single-view images. In Advances in Neural Information Processing Systems, pages 746–754, 2012.
[41] S. Zhu, L. Zhang, and B. Smith. Model evolution: An incremental approach to non-rigid structure from motion. In CVPR, 2010.
[42] M. Z. Zia, M. Stark, B. Schiele, and K. Schindler. Detailed 3d representations for object recognition and modeling. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 35(11):2608–2623, 2013.
