
Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials

Philipp Krähenbühl, Computer Science Department, Stanford University, philkr@cs.stanford.edu
Vladlen Koltun, Computer Science Department, Stanford University, vladlen@cs.stanford.edu

Abstract

Most state-of-the-art techniques for multi-class image segmentation and labeling use conditional random fields defined over pixels or image reg[...]


[...] structure that tiles the feature space with simplices arranged along $d+1$ axes [1]. The permutohedral lattice exploits the separability of unit variance Gaussian kernels. Thus we need to apply a whitening transform $\tilde{f} = U f$ to the feature space in order to use it. The whitening transformation is found using the Cholesky decomposition of $\Lambda^{(m)}$ into $U U^T$. In the transformed space, the high-dimensional convolution can be separated into a sequence of one-dimensional convolutions along the axes of the lattice. The resulting approximate message passing procedure is highly efficient even with a fully sequential implementation that does not make use of parallelism or the streaming capabilities of graphics hardware, which can provide further acceleration if desired.

4 Learning

We learn the parameters of the model by piecewise training. First, the boosted unary classifiers are trained using the JointBoost algorithm [21], using the features described in Section 5. Next we learn the appearance kernel parameters $w^{(1)}$, $\theta_\alpha$, and $\theta_\beta$ for the Potts model. $w^{(1)}$ can be found efficiently by a combination of expectation maximization and high-dimensional filtering. Unfortunately, the kernel widths $\theta_\alpha$ and $\theta_\beta$ cannot be computed effectively with this approach, since their gradient involves a sum of non-Gaussian kernels, which are not amenable to the same acceleration techniques. We found it to be more efficient to use grid search on a holdout validation set for all three kernel parameters $w^{(1)}$, $\theta_\alpha$ and $\theta_\beta$.

The smoothness kernel parameters $w^{(2)}$ and $\theta_\gamma$ do not significantly affect classification accuracy, but yield a small visual improvement. We found $w^{(2)} = \theta_\gamma = 1$ to work well in practice.

The compatibility parameters $\mu(a,b) = \mu(b,a)$ are learned using L-BFGS to maximize the log-likelihood $\ell(\mu : \mathcal{I}, \mathcal{T})$ of the model for a validation set of images $\mathcal{I}$ with corresponding ground truth labelings $\mathcal{T}$. L-BFGS requires the computation of the gradient of $\ell$, which is intractable to estimate exactly, since it requires computing the gradient of the partition function $Z$. Instead, we use the mean field approximation described in Section 3 to estimate the gradient of $Z$. This leads to a simple approximation of the gradient for each training image:

$$
\frac{\partial}{\partial \mu(a,b)} \ell(\mu : \mathcal{I}^{(n)}, \mathcal{T}^{(n)}) \approx -\sum_i \mathcal{T}^{(n)}_i(a) \sum_{j \neq i} k(f_i, f_j)\, \mathcal{T}^{(n)}_j(b) + \sum_i Q_i(a) \sum_{j \neq i} k(f_i, f_j)\, Q_j(b), \tag{6}
$$

where $(\mathcal{I}^{(n)}, \mathcal{T}^{(n)})$ is a single training image with its ground truth labeling and $\mathcal{T}^{(n)}(a)$ is a binary image in which the $i$th pixel $\mathcal{T}^{(n)}_i(a)$ has value 1 if the ground truth label at the $i$th pixel of $\mathcal{T}^{(n)}$ is $a$ and 0 otherwise. A detailed derivation of Equation 6 is given in the supplementary material.

The sums $\sum_{j \neq i} k(f_i, f_j)\, \mathcal{T}_j(b)$ and $\sum_{j \neq i} k(f_i, f_j)\, Q_j(b)$ are both computationally expensive to evaluate directly. As in Section 3.2, we use high-dimensional filtering to compute both sums efficiently. The runtime of the final learning algorithm is linear in the number of variables $N$.

5 Implementation

The unary potentials used in our implementation are derived from TextonBoost [19, 13]. We use the 17-dimensional filter bank suggested by Shotton et al. [19], and follow Ladický et al. [13] by adding color, histogram of oriented gradients (HOG), and pixel location features. Our evaluation on the MSRC-21 dataset uses this extended version of TextonBoost for the unary potentials. For the VOC 2010 dataset we include the response of bounding box object detectors [4] for each object class as 20 additional features. This increases the performance of the unary classifiers on the VOC 2010 from 13% to 22%. We gain an additional 5% by training a logistic regression classifier on the responses of the boosted classifier.

For efficient high-dimensional filtering, we use a publicly available implementation of the permutohedral lattice [1]. We found a downsampling rate of one standard deviation to work best for all our experiments. Sampling-based filtering algorithms underestimate the edge strength $k(f_i, f_j)$ for very similar feature points. Proper normalization can cancel out most of this error. The permutohedral lattice allows for two types of normalizations. A global normalization by the average kernel strength [...]

                                 Standard ground truth    Accurate ground truth
Method                 Runtime   Global     Average       Global        Average
Unary classifiers      -         84.0       76.6          83.2 ± 1.5    80.6 ± 2.3
Grid CRF               1 s       84.6       77.2          84.8 ± 1.5    82.4 ± 1.8
Robust $P^n$ CRF       30 s      84.9       77.5          86.5 ± 1.0    83.1 ± 1.5
Fully connected CRF    0.2 s     86.0       78.3          88.2 ± 0.7    84.7 ± 0.7
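As a rough illustration of the gradient approximation in Equation 6 (Section 4), the following sketch evaluates both terms by brute force for a single training image. It is not the authors' implementation: the paper computes the inner sums with high-dimensional filtering in time linear in N, whereas this version is a dense O(N^2) loop for clarity. The function names, the single-width Gaussian kernel, and the list-of-lists layout of `T` and `Q` are assumptions made for this example, and the second inner sum is taken over Q_j(b), mirroring the first term.

```python
import math

def gaussian_kernel(fi, fj, theta=1.0):
    """Gaussian kernel between two feature vectors (assumed single-width form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(fi, fj))
    return math.exp(-sq_dist / (2.0 * theta ** 2))

def pairwise_sum(features, values, kernel):
    """For each i, return sum over j != i of k(f_i, f_j) * values[j].

    Dense O(N^2) stand-in for the high-dimensional filtering step.
    """
    n = len(features)
    out = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if j != i:
                total += kernel(features[i], features[j]) * values[j]
        out.append(total)
    return out

def compatibility_gradient(features, T, Q, a, b, kernel=gaussian_kernel):
    """Approximate the derivative of the log-likelihood w.r.t. mu(a, b).

    T[i][l] is 1 if pixel i has ground truth label l, else 0.
    Q[i][l] is the mean field marginal of label l at pixel i.
    """
    n = len(features)
    filtered_t = pairwise_sum(features, [T[i][b] for i in range(n)], kernel)
    filtered_q = pairwise_sum(features, [Q[i][b] for i in range(n)], kernel)
    data_term = sum(T[i][a] * filtered_t[i] for i in range(n))
    model_term = sum(Q[i][a] * filtered_q[i] for i in range(n))
    return -data_term + model_term
```

When the marginals `Q` coincide with the ground truth indicators `T`, the two terms cancel and the approximate gradient is zero, which matches the intuition that no update is needed once the model reproduces the labeling.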
Figure 3: Qualitative and quantitative results on the MSRC-21 dataset.

[...] parameters. The unary potentials were learned on a separate training set that did not include the 94 accurately annotated images.

We also adopt the methodology proposed by Kohli et al. [9] for evaluating segmentation accuracy around boundaries. Specifically, we count the relative number of misclassified pixels within a narrow band ("trimap") surrounding actual object boundaries, obtained from the accurate ground truth images. As shown in Figure 4, our algorithm outperforms previous work across all trimap widths.

PASCAL VOC 2010. Due to the lack of a publicly available ground truth labeling for the test set in the PASCAL VOC 2010, we use the training and validation data for all our experiments. We randomly partitioned the images into 3 groups: 40% training, 15% validation, and 45% test set. Segmentation accuracy was measured using the standard VOC measure [3]. The unary potentials were learned on the training set and yielded an average classification accuracy of 27.6%. The parameters for the Potts potentials in the fully connected CRF model were learned on the validation set. The [...]

Figure 4: Segmentation accuracy around object boundaries. (a) Trimaps of different widths: visualization of the "trimap" measure. (b) Segmentation accuracy within trimap: percent of misclassified pixels within trimaps of different widths.
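The trimap measure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the evaluation code used for Figure 4: boundaries are detected with 4-connectivity, the band is a square (Chebyshev) neighbourhood of `width` pixels around each boundary pixel, and unlabeled ("void") pixels are not handled. The function names are hypothetical.

```python
def boundary_pixels(labels):
    """Return the (row, col) pixels that have a 4-neighbour with a different label."""
    h, w = len(labels), len(labels[0])
    found = set()
    for r in range(h):
        for c in range(w):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w and labels[rr][cc] != labels[r][c]:
                    found.add((r, c))
                    break
    return found

def trimap_error(truth, pred, width):
    """Fraction of misclassified pixels within `width` pixels of a true boundary."""
    h, w = len(truth), len(truth[0])
    band = set()
    # Grow the band around every ground truth boundary pixel.
    for r, c in boundary_pixels(truth):
        for rr in range(max(0, r - width), min(h, r + width + 1)):
            for cc in range(max(0, c - width), min(w, c + width + 1)):
                band.add((rr, cc))
    if not band:
        return 0.0  # no boundaries in the ground truth
    wrong = sum(1 for r, c in band if truth[r][c] != pred[r][c])
    return wrong / len(band)
```

Calling `trimap_error` for a range of widths and multiplying by 100 gives a curve of the kind plotted in Figure 4(b).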
References

[1] A. Adams, J. Baek, and M. A. Davis. Fast high-dimensional filtering using the permutohedral lattice. Computer Graphics Forum, 29(2), 2010.
[2] A. Adams, N. Gelfand, J. Dolson, and M. Levoy. Gaussian kd-trees for fast high-dimensional filtering. ACM Transactions on Graphics, 28(3), 2009.
[3] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL Visual Object Classes (VOC) challenge. IJCV, 88(2), 2010.
[4] P. F. Felzenszwalb, R. B. Girshick, and D. A. McAllester. Cascade object detection with deformable part models. In Proc. CVPR, 2010.
[5] B. Fulkerson, A. Vedaldi, and S. Soatto. Class segmentation and object localization with superpixel neighborhoods. In Proc. ICCV, 2009.
[6] C. Galleguillos, A. Rabinovich, and S. Belongie. Object categorization using co-occurrence, location and appearance. In Proc. CVPR, 2008.
[7] S. Gould, J. Rodgers, D. Cohen, G. Elidan, and D. Koller. Multi-class segmentation with relative location prior. IJCV, 80(3), 2008.
[8] X. He, R. S. Zemel, and M. A. Carreira-Perpinan. Multiscale conditional random fields for image labeling. In Proc. CVPR, 2004.
[9] P. Kohli, L. Ladický, and P. H. S. Torr. Robust higher order potentials for enforcing label consistency. IJCV, 82(3), 2009.
[10] D. Koller and N. Friedman. Probabilistic Graphical Models: Principles and Techniques. MIT Press, 2009.
[11] V. Kolmogorov and R. Zabih. What energy functions can be minimized via graph cuts? PAMI, 26(2), 2004.
[12] S. Kumar and M. Hebert. A hierarchical field framework for unified context-based classification. In Proc. ICCV, 2005.
[13] L. Ladický, C. Russell, P. Kohli, and P. H. S. Torr. Associative hierarchical CRFs for object class image segmentation. In Proc. ICCV, 2009.
[14] L. Ladický, C. Russell, P. Kohli, and P. H. S. Torr. Graph cut based inference with co-occurrence statistics. In Proc. ECCV, 2010.
[15] J. D. Lafferty, A. McCallum, and F. C. N. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proc. ICML, 2001.
[16] S. Paris and F. Durand. A fast approximation of the bilateral filter using a signal processing approach. IJCV, 81(1), 2009.
[17] N. Payet and S. Todorovic. (RF)^2 - random forest random field. In Proc. NIPS, 2010.
[18] A. Rabinovich, A. Vedaldi, C. Galleguillos, E. Wiewiora, and S. Belongie. Objects in context. In Proc. ICCV, 2007.
[19] J. Shotton, J. M. Winn, C. Rother, and A. Criminisi. TextonBoost for image understanding: Multi-class object recognition and segmentation by jointly modeling texture, layout, and context. IJCV, 81(1), 2009.
[20] S. W. Smith. The scientist and engineer's guide to digital signal processing. California Technical Publishing, 1997.
[21] A. Torralba, K. P. Murphy, and W. T. Freeman. Sharing visual features for multiclass and multiview object detection. PAMI, 29(5), 2007.
[22] T. Toyoda and O. Hasegawa. Random field model for integration of local information and global information. PAMI, 30, 2008.
[23] J. J. Verbeek and B. Triggs. Scene segmentation with CRFs learned from partially labeled images. In Proc. NIPS, 2007.