Blocks World Revisited: Image Understanding Using Qualitative Geometry and Mechanics Abhinav


Efros and Martial Hebert, Robotics Institute, Carnegie Mellon University

Abstract: Since most current scene understanding approaches operate either on the 2D image or using a surface-based representation, they do not allow reasoning about the physical co


2 Gupta et al.

Fig. 1. Example output of our automatic scene understanding system. The 3D parse graph summarizes the inferred object properties (physical boundaries, geometric type, and mechanical properties) and relationships between objects within the scene. See more examples on project webpage.

impossible or highly unlikely. Second, even if successful, these pop-up models (also known as "billboards" in graphics) lack the physical substance of a true 3D representation. Like in a Potemkin village, there is nothing behind the pretty facades!

This paper argues that a more physical representation of the scene, where objects have volume and mass, can provide crucial high-level constraints to help construct a globally-consistent model of the scene, as well as allow for powerful ways of understanding and interpreting the underlying image. These new constraints come in the form of geometric relationships between 3D volumes as well as laws of statics governing the behavior of forces and torques. Our main insight is that the problem can be framed qualitatively, without requiring a metric reconstruction of the 3D scene structure (which is, of course, impossible from a single image). Figure 1 shows a real output from our fully-automatic system. The paper's main contributions are: (a) a novel qualitative scene representation based on volumes (blocks) drawn from a small library; (b) the use of 3D geometry and mechanical constraints for reasoning about scene structure; (c) an iterative Interpretation-by-Synthesis framework that, starting from the empty ground plane, progressively "builds up" a consistent and coherent interpretation of the image; (d) a top-down segmentation adjustment procedure where partial scene interpretations guide the creation of new segment proposals.

Related Work: The idea that the basic physical and geometric constraints of our world (so-called laws of nature) play a crucial role in visual perception goes back at least to Helmholtz and his argument for "unconscious inference". In computer vision, this theme can be traced back to the very beginnings of our discipline, with Larry Roberts arguing in 1965 that "the perception of solid objects is a process which can be based on the properties of three-dimensional transformations and the laws of nature" [17]. Roberts' famous Blocks World was a daring early attempt at producing a complete scene understanding system for a closed artificial world of textureless polyhedral shapes by using a generic library of polyhedral block components. At the same time, researchers in robotics also realized the importance of physical stability of block assemblies since many block configurations, while geometrically possible, were not physically stable. They showed how to generate plans for the manipulation steps required to go from an initial configuration to a target configuration such that at any stage of assembly the blocks world remained stable [1]. Finally, the MIT Copy Demo [21] combined the two efforts, demonstrating a robot that could visually observe a blocks world configuration and then recreate it from a pile of unordered blocks (recently [2] gave a more sophisticated reinterpretation of this idea, but still in a highly constrained environment).

Unfortunately, hopes that the insights learned from the blocks world would carry over into the real world did not materialize as it became apparent that algorithms were too dependent on its very restrictive assumptions (perfect boundary detection, textureless surfaces, etc.). While the idea of using 3D geometric primitives for understanding real scenes carried on into the work on generalized cylinders and resulted in some impressive demos in the 1980s (e.g., ACRONYM [16]), it eventually gave way to the currently dominant appearance-based, semantic labeling methods, e.g., [19,5]. Of these, the most ambitious is the effort of S.C. Zhu and colleagues [23] who use a hand-crafted stochastic grammar over a highly detailed dataset of labelled objects and parts to hierarchically parse an image. While they show impressive results for a few specific scene types (e.g., kitchens, corridors) the approach is yet to be demonstrated on more general data.

Most related to our work is a recent series of methods that attempt to model geometric scene structure from a single image: inferring qualitative geometry of surfaces [8,18], finding ground/vertical "fold" lines [3], grouping lines into surfaces [22,13], estimating occlusion boundaries [10], and combining geometric and semantic information [9,14,4]. However, these approaches do not model the global interactions between the geometric entities within the scene, and attempts to incorporate them at the 2D image labeling level [15,12] have been only partially successful. While single volumes have been used to model simple building interiors [6] and objects (such as bed) [7], these approaches do not model geometric or mechanical inter-volume relationships. And while modeling physical constraints has been used in the context of dynamic object relationships in video [20], we are not aware of any work using them to analyze static images.
Fig. 6. Our approach for evaluating block hypothesis and estimating the associated cost of placing a block.

estimation algorithm for superpixels. For example, if the block is associated with "front-right" view class and the superpixel is on the right of the folding edge, then $P(g_s \mid G_i, f_i, s)$ would be the probability of the superpixel being labeled right-facing by the surface layout estimation algorithm. For estimating the contact points likelihood term, we use the constraints of perspective projection. Given the block geometry and the folding edge, we fit straight lines $l_g$ and $l_s$ to the ground and sky contact points, respectively, and we verify if their slopes are in agreement with the surface geometry: for a frontal surface, $l_g$ and $l_s$ should be horizontal, and for left- and right-facing surfaces $l_g$ and $l_s$ should intersect on the horizon line.

3.5 Estimating Physical Stability

Our stability measure (Figure 6d) consists of three terms. (1) Internal Stability: We prefer blocks with low potential energies, that is, blocks which have heavier bottom and lighter top. This is useful for rejecting segmentations which merge two segments with different densities, such as the lighter object below the heavier object shown in Figure 4(c). For computing internal stability, we rotate the block by a small angle, $\delta\theta$, (clockwise and anti-clockwise) around the center of each face; and compute the change in potential energy of the block as:

\[ \Delta P_i = \sum_{c \in \{\text{light}, \text{medium}, \text{heavy}\}} \sum_{s \in S_i} p(d_s = c) \, m_c \, \delta h_s \tag{4} \]

where $p(d_s = c)$ is the probability of assigning density class $c$ to superpixel $s$, $\delta h_s$ is the change in height due to the rotation and $m_c$ is a constant representing the density class. The change in potential energy is a function of three constants.
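A minimal sketch of Equation (4), assuming the per-superpixel density-class probabilities and height changes have already been computed. The function name, the input layout, and the numeric mass constants are illustrative assumptions and do not come from the paper.

```python
DENSITY_CLASSES = ("light", "medium", "heavy")

# Hypothetical values for the three constants m_c; the paper treats them
# qualitatively and does not give numbers.
CLASS_MASS = {"light": 1.0, "medium": 2.0, "heavy": 4.0}


def potential_energy_change(density_probs, height_changes, class_mass):
    """Evaluate Eq. (4): sum over superpixels s and classes c of
    p(d_s = c) * m_c * delta_h_s.

    density_probs  -- one dict per superpixel mapping class name to p(d_s = c)
    height_changes -- one float per superpixel, the height change delta_h_s
    class_mass     -- dict mapping class name to the constant m_c
    """
    total = 0.0
    for probs, delta_h in zip(density_probs, height_changes):
        # Expected density constant of this superpixel under its class
        # distribution, then weighted by how far the rotation moved it.
        expected_mass = sum(probs[c] * class_mass[c] for c in DENSITY_CLASSES)
        total += expected_mass * delta_h
    return total


# Two made-up superpixels: one certainly light that rises by 0.5,
# one split between medium and heavy that drops by 0.2.
example_probs = [
    {"light": 1.0, "medium": 0.0, "heavy": 0.0},
    {"light": 0.0, "medium": 0.5, "heavy": 0.5},
]
example_heights = [0.5, -0.2]

delta_p = potential_energy_change(example_probs, example_heights, CLASS_MASS)
print(delta_p)  # approximately -0.1
```

Because the sum is linear in the three constants, the same routine can be re-evaluated with different values of `class_mass` to see how the result depends on the assumed density ratios.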
Fig. 7. (a) Computation of torques around contact lines. (b) Extracted depth constraints are based on convexity and support relationships among blocks.

Using constraints such as $\rho_{hm} = m_{heavy} / m_{medium} > 1$ and $\rho_{lm} = m_{light} / m_{medium} < 1$, we compute the expected value of $\Delta P_i$ with respect to the ratio of densities ($\rho_{hm}$ and $\rho_{lm}$). The prior on ratio of densities for the objects can be derived using density and the frequency of occurrence of different materials in our training images. (2) Stability: We compute the likelihood of a block being stable given the density configuration and support relations. For this, we first compute the contact points of the block with the supporting block and then compute the torque due to gravitational force exerted by each superpixel and the resultant contact force around the contact line (Figure 7a). This again leads to torque as a function of three constants and we use similar qualitative analysis to compute the stability. (3) Constraints from Block Strength: We also derive constraints on support attributes based on the densities of the two blocks possibly interacting with each other. If the density of the supporting block is less than the density of the supported block, we then assume that the two blocks are not in physical contact and the block below occludes the contact of the block above with the ground.

3.6 Extracting Depth Constraints

The depth ordering constraints (Figure 6(e)) are used to guide the next step of refining the segmentation by splitting and merging regions. Computing depth ordering requires estimating pairwise depth constraints on blocks and then using them to form global depth ordering. The rules for inferring depth constraints are shown in Figure 7(b). These pairwise constraints are then used to generate a global partial depth ordering via a simple constraint satisfaction approach.

3.7 Creating Split and Merge Proposals

This final step involving changes to the segmentation (Figure 6f) is crucial because it avoids the pitfalls of previous systems which assumed a fixed, initial segmentation (or even multiple segmentations) and were unable to recover from incorrect or incomplete groupings. For example, no segmentation algorithm can group two regions separated by an occluding object because such a merge would require reasoning about depth ordering. It is precisely this type of reasoning that the depth ordering estimation of Section 3.6 enables. We include segmentation in the interpretation loop and use the current interpretation of the scene to generate more segments that can be utilized as blocks in the blocks world.

Using estimated depth relationships and block view classes we create merge proposals where two or more non-adjacent segments are combined if they share a block as neighbor which is estimated to be in front of them in the current viewpoint. In that case, the shared block is interpreted as an occluder which fragmented the background block into pieces which the merge proposal attempts to reconnect. We also create additional merge proposals by combining two or more neighboring segments. Split proposals divide a block into two or more blocks if the inferred properties of the block are not in agreement with confident individual cues. For example, if the surface layout algorithm estimates a surface as frontal with high confidence and our inferred geometry is not frontal, then the block is divided to create two or more blocks that agree with the surface layout. The split and merge proposals are then evaluated by a cost function whose terms are based on the confidence in the estimated geometry and physical stability of the new block(s) compared to previous block(s). In our experiments, approximately 11% of the blocks are created using the resegmentation procedure.

4 Experimental Results

Since there has been so little done in the area of qualitative volumetric scene understanding, there are no established datasets, evaluation methodologies, or even much in terms of relevant previous work to compare against. Therefore, we will present our evaluation in two parts: 1) qualitatively, by showing a few representative scene parse results in the paper, and a wide variety of results on the project webpage¹; 2) quantitatively, by evaluating individual components of our system and, when available, comparing against the relevant previous work.

Dataset: We use the dataset and methodology of Hoiem et al. [9] for comparison. This dataset consists of 300 images of outdoor scenes with ground truth surface orientation labeled for all images, but occlusion boundaries are only labelled for 100 images. The first 50 (of the 100) are used for training the surface segmentation [8] and occlusion reasoning [10] of our segmenter. The remaining 250 images are used to evaluate our blocks world approach. The surface classifiers are trained and tested using five-fold cross-validation just like in [9].

Qualitative: Figure 8 shows two examples of complete interpretation automatically generated by the system and a 3D toy blocks world generated in VRML. In the top example, the building is occluded by a tree in the image and therefore none of the previous approaches can combine the two faces of the building to produce a single building region. In a pop-up based representation, the placement of the left face is unconstrained due to the contact with ground not being visible. However, in our approach volumetric constraints aid the reasoning process and combine the two faces to produce a block occluded by the tree. The bottom example shows how statics can help in selecting the best blocks and improve block-view estimation. Reasoning about mechanical constraints rejects the segment corresponding to the whole building (due to unbalanced torque). For the selected convex block, the cues from ground and sky contact points aid in proper geometric classification of the block. Figure 9 shows a few other qualitative examples with the overlaid block and estimated surface orientations.

Quantitative: We evaluate various components of our system separately. It is not possible to quantitatively compare the performance of the entire system

¹ http://www.cs.cmu.edu/abhinavg/blocksworld
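As an illustration of the depth-ordering step in Section 3.6, the sketch below turns pairwise "in front of" constraints into a layered partial order. The paper only states that a simple constraint satisfaction approach is used; the layered topological sort (Kahn's algorithm) here is a swapped-in stand-in, and the function name, block names, and input format are hypothetical.

```python
from collections import defaultdict


def global_depth_order(blocks, in_front_of):
    """Group blocks into depth layers, nearest layer first.

    blocks      -- iterable of block names
    in_front_of -- iterable of (near, far) pairs: `near` is in front of `far`
    Returns a list of layers (each a sorted list of names), or None when the
    pairwise constraints contradict each other (contain a cycle).
    """
    successors = defaultdict(set)
    indegree = {b: 0 for b in blocks}
    for near, far in in_front_of:
        indegree.setdefault(near, 0)
        indegree.setdefault(far, 0)
        if far not in successors[near]:  # ignore duplicate constraints
            successors[near].add(far)
            indegree[far] += 1

    # Blocks with nothing in front of them form the nearest layer.
    layer = sorted(b for b, d in indegree.items() if d == 0)
    layers, placed = [], 0
    while layer:
        layers.append(layer)
        placed += len(layer)
        next_layer = []
        for b in layer:
            for far in successors[b]:
                indegree[far] -= 1
                if indegree[far] == 0:
                    next_layer.append(far)
        layer = sorted(next_layer)

    # If some block was never released, the constraints were cyclic.
    return layers if placed == len(indegree) else None


# Made-up scene: a tree occludes a building, which occludes a hill;
# a car has no depth constraints at all.
order = global_depth_order(
    ["tree", "building", "hill", "car"],
    [("tree", "building"), ("building", "hill")],
)
print(order)  # [['car', 'tree'], ['building'], ['hill']]
```

Blocks that share a layer are left unordered relative to each other, which is what makes the result a partial rather than a total ordering.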