Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials
Philipp Krähenb…

Uploaded On 2014-10-14


…stanford.edu
Vladlen Koltun, Computer Science Department, Stanford University, vladlen@cs.stanford.edu

Abstract

Most state-of-the-art techniques for multi-class image segmentation and labeling use conditional random fields defined over pixels or image reg…


[…] structure that tiles the feature space with simplices arranged along $d+1$ axes [1]. The permutohedral lattice exploits the separability of unit variance Gaussian kernels. Thus we need to apply a whitening transform $\tilde{f} = U f$ to the feature space in order to use it. The whitening transformation is found using the Cholesky decomposition of $\Lambda^{(m)}$ into $U U^T$. In the transformed space, the high-dimensional convolution can be separated into a sequence of one-dimensional convolutions along the axes of the lattice. The resulting approximate message passing procedure is highly efficient even with a fully sequential implementation that does not make use of parallelism or the streaming capabilities of graphics hardware, which can provide further acceleration if desired.

4 Learning

We learn the parameters of the model by piecewise training. First, the boosted unary classifiers are trained using the JointBoost algorithm [21], using the features described in Section 5. Next we learn the appearance kernel parameters $w^{(1)}$, $\theta_\alpha$, and $\theta_\beta$ for the Potts model. $w^{(1)}$ can be found efficiently by a combination of expectation maximization and high-dimensional filtering. Unfortunately, the kernel widths $\theta_\alpha$ and $\theta_\beta$ cannot be computed effectively with this approach, since their gradient involves a sum of non-Gaussian kernels, which are not amenable to the same acceleration techniques. We found it to be more efficient to use grid search on a holdout validation set for all three kernel parameters $w^{(1)}$, $\theta_\alpha$ and $\theta_\beta$.

The smoothness kernel parameters $w^{(2)}$ and $\theta_\gamma$ do not significantly affect classification accuracy, but yield a small visual improvement. We found $w^{(2)} = \theta_\gamma = 1$ to work well in practice.

The compatibility parameters $\mu(a,b) = \mu(b,a)$ are learned using L-BFGS to maximize the log-likelihood $\ell(\mu : I, T)$ of the model for a validation set of images $I$ with corresponding ground truth labelings $T$. L-BFGS requires the computation of the gradient of $\ell$, which is intractable to estimate exactly, since it requires computing the gradient of the partition function $Z$. Instead, we use the mean field approximation described in Section 3 to estimate the gradient of $Z$. This leads to a simple approximation of the gradient for each training image:

$$\frac{\partial}{\partial \mu(a,b)} \ell(\mu : I^{(n)}, T^{(n)}) \approx -\sum_i T^{(n)}_i(a) \sum_{j \neq i} k(f_i, f_j)\, T^{(n)}_j(b) + \sum_i Q_i(a) \sum_{j \neq i} k(f_i, f_j)\, Q_j(b), \qquad (6)$$

where $(I^{(n)}, T^{(n)})$ is a single training image with its ground truth labeling and $T^{(n)}(a)$ is a binary image in which the $i$th pixel $T^{(n)}_i(a)$ has value 1 if the ground truth label at the $i$th pixel of $T^{(n)}$ is $a$ and 0 otherwise. A detailed derivation of Equation 6 is given in the supplementary material.

The sums $\sum_{j \neq i} k(f_i, f_j)\, T_j(b)$ and $\sum_{j \neq i} k(f_i, f_j)\, Q_j(b)$ are both computationally expensive to evaluate directly. As in Section 3.2, we use high-dimensional filtering to compute both sums efficiently. The runtime of the final learning algorithm is linear in the number of variables $N$.

5 Implementation

The unary potentials used in our implementation are derived from TextonBoost [19, 13]. We use the 17-dimensional filter bank suggested by Shotton et al. [19], and follow Ladický et al. [13] by adding color, histogram of oriented gradients (HOG), and pixel location features. Our evaluation on the MSRC-21 dataset uses this extended version of TextonBoost for the unary potentials. For the VOC 2010 dataset we include the response of bounding box object detectors [4] for each object class as 20 additional features. This increases the performance of the unary classifiers on the VOC 2010 from 13% to 22%. We gain an additional 5% by training a logistic regression classifier on the responses of the boosted classifier.

For efficient high-dimensional filtering, we use a publicly available implementation of the permutohedral lattice [1]. We found a downsampling rate of one standard deviation to work best for all our experiments. Sampling-based filtering algorithms underestimate the edge strength $k(f_i, f_j)$ for very similar feature points. Proper normalization can cancel out most of this error. The permutohedral lattice allows for two types of normalizations. A global normalization by the average kernel strength […]

                                  Standard ground truth    Accurate ground truth
                      Runtime     Global      Average      Global        Average
Unary classifiers     -           84.0        76.6         83.2 ± 1.5    80.6 ± 2.3
Grid CRF              1 s         84.6        77.2         84.8 ± 1.5    82.4 ± 1.8
Robust P^n CRF        30 s        84.9        77.5         86.5 ± 1.0    83.1 ± 1.5
Fully connected CRF   0.2 s       86.0        78.3         88.2 ± 0.7    84.7 ± 0.7
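To make the structure of the gradient approximation in Equation 6 concrete, the following is a minimal brute-force sketch in Python. It is not the authors' implementation: it evaluates both double sums directly in O(N^2) time instead of using high-dimensional filtering, it assumes a single unit-variance Gaussian kernel on already-whitened features, and it assumes the second inner sum runs over Q_j(b), mirroring the T_j(b) term. The function names `gaussian_kernel` and `compat_gradient` and the list-based data layout are hypothetical choices made for illustration.

```python
import math


def gaussian_kernel(fi, fj):
    """Assumed kernel: unit-variance Gaussian on (already whitened) feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(fi, fj))
    return math.exp(-0.5 * sq_dist)


def compat_gradient(features, truth, Q, a, b):
    """Brute-force O(N^2) evaluation of the right-hand side of Equation 6.

    features: list of N feature vectors f_i
    truth:    list of N ground-truth labels, so T_i(a) = 1 if truth[i] == a else 0
    Q:        list of N lists, Q[i][l] = approximate marginal of label l at pixel i
    """
    n = len(features)
    data_term = 0.0   # sum_i T_i(a) sum_{j != i} k(f_i, f_j) T_j(b)
    model_term = 0.0  # sum_i Q_i(a) sum_{j != i} k(f_i, f_j) Q_j(b)
    for i in range(n):
        t_ia = 1.0 if truth[i] == a else 0.0
        for j in range(n):
            if j == i:
                continue
            k = gaussian_kernel(features[i], features[j])
            t_jb = 1.0 if truth[j] == b else 0.0
            data_term += t_ia * k * t_jb
            model_term += Q[i][a] * k * Q[j][b]
    return -data_term + model_term


# Two pixels with identical features and different ground-truth labels.
features = [(0.0,), (0.0,)]
truth = [0, 1]
uniform_Q = [[0.5, 0.5], [0.5, 0.5]]
grad = compat_gradient(features, truth, uniform_Q, a=0, b=1)
```

When the marginals Q coincide with the one-hot ground truth, the two terms cancel and the sketch returns zero, which is the expected behavior at a perfect fit. The quadratic loop is only practical for tiny inputs; the text replaces it with filtering to reach runtime linear in N.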
Figure 3: Qualitative and quantitative results on the MSRC-21 dataset.

[…]rameters. The unary potentials were learned on a separate training set that did not include the 94 accurately annotated images.

We also adopt the methodology proposed by Kohli et al. [9] for evaluating segmentation accuracy around boundaries. Specifically, we count the relative number of misclassified pixels within a narrow band (“trimap”) surrounding actual object boundaries, obtained from the accurate ground truth images. As shown in Figure 4, our algorithm outperforms previous work across all trimap widths.

PASCAL VOC 2010. Due to the lack of a publicly available ground truth labeling for the test set in the PASCAL VOC 2010, we use the training and validation data for all our experiments. We randomly partitioned the images into 3 groups: 40% training, 15% validation, and 45% test set. Segmentation accuracy was measured using the standard VOC measure [3]. The unary potentials were learned on the training set and yielded an average classification accuracy of 27.6%. The parameters for the Potts potentials in the fully connected CRF model were learned on the validation set. The […]

[Figure 4 panels: (a) Trimaps of different widths; (b) Segmentation accuracy within trimap]

Figure 4: Segmentation accuracy around object boundaries. (a) Visualization of the “trimap” measure. (b) Percent of misclassified pixels within trimaps of different widths.
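The trimap measure described above can be sketched as follows. This is an illustrative Python version under stated assumptions, not the evaluation code used in the paper: a boundary pixel is assumed to be any pixel with a 4-connected neighbor carrying a different ground-truth label, and the band is assumed to be every pixel within a square (Chebyshev) distance `width` of a boundary pixel. The function names `boundary_pixels` and `trimap_error` are hypothetical.

```python
def boundary_pixels(gt):
    """Pixels whose 4-connected neighborhood contains a different ground-truth label."""
    h, w = len(gt), len(gt[0])
    edges = set()
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and gt[ny][nx] != gt[y][x]:
                    edges.add((y, x))
                    break
    return edges


def trimap_error(gt, pred, width):
    """Fraction of misclassified pixels inside a band around ground-truth boundaries.

    gt, pred: 2D lists of labels with identical shape
    width:    band half-width in pixels (0 keeps only the boundary pixels themselves)
    """
    h, w = len(gt), len(gt[0])
    band = set()
    for (y, x) in boundary_pixels(gt):
        for ny in range(max(0, y - width), min(h, y + width + 1)):
            for nx in range(max(0, x - width), min(w, x + width + 1)):
                band.add((ny, nx))
    if not band:
        return 0.0  # no object boundaries in this image
    wrong = sum(1 for (y, x) in band if pred[y][x] != gt[y][x])
    return wrong / len(band)


# One-row toy image: the prediction places the boundary one pixel too far left.
gt = [[0, 0, 0, 1, 1, 1]]
pred = [[0, 0, 1, 1, 1, 1]]
errors = [trimap_error(gt, pred, width) for width in (0, 1, 2)]
```

In the toy image the single misplaced pixel dominates the narrowest band and is diluted as the band widens, which is the behavior the plot in Figure 4(b) tracks across trimap widths.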
References

[1] A. Adams, J. Baek, and M. A. Davis. Fast high-dimensional filtering using the permutohedral lattice. Computer Graphics Forum, 29(2), 2010.
[2] A. Adams, N. Gelfand, J. Dolson, and M. Levoy. Gaussian kd-trees for fast high-dimensional filtering. ACM Transactions on Graphics, 28(3), 2009.
[3] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL Visual Object Classes (VOC) challenge. IJCV, 88(2), 2010.
[4] P. F. Felzenszwalb, R. B. Girshick, and D. A. McAllester. Cascade object detection with deformable part models. In Proc. CVPR, 2010.
[5] B. Fulkerson, A. Vedaldi, and S. Soatto. Class segmentation and object localization with superpixel neighborhoods. In Proc. ICCV, 2009.
[6] C. Galleguillos, A. Rabinovich, and S. Belongie. Object categorization using co-occurrence, location and appearance. In Proc. CVPR, 2008.
[7] S. Gould, J. Rodgers, D. Cohen, G. Elidan, and D. Koller. Multi-class segmentation with relative location prior. IJCV, 80(3), 2008.
[8] X. He, R. S. Zemel, and M. A. Carreira-Perpinan. Multiscale conditional random fields for image labeling. In Proc. CVPR, 2004.
[9] P. Kohli, L. Ladický, and P. H. S. Torr. Robust higher order potentials for enforcing label consistency. IJCV, 82(3), 2009.
[10] D. Koller and N. Friedman. Probabilistic Graphical Models: Principles and Techniques. MIT Press, 2009.
[11] V. Kolmogorov and R. Zabih. What energy functions can be minimized via graph cuts? PAMI, 26(2), 2004.
[12] S. Kumar and M. Hebert. A hierarchical field framework for unified context-based classification. In Proc. ICCV, 2005.
[13] L. Ladický, C. Russell, P. Kohli, and P. H. S. Torr. Associative hierarchical crfs for object class image segmentation. In Proc. ICCV, 2009.
[14] L. Ladický, C. Russell, P. Kohli, and P. H. S. Torr. Graph cut based inference with co-occurrence statistics. In Proc. ECCV, 2010.
[15] J. D. Lafferty, A. McCallum, and F. C. N. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proc. ICML, 2001.
[16] S. Paris and F. Durand. A fast approximation of the bilateral filter using a signal processing approach. IJCV, 81(1), 2009.
[17] N. Payet and S. Todorovic. (RF)^2 – random forest random field. In Proc. NIPS, 2010.
[18] A. Rabinovich, A. Vedaldi, C. Galleguillos, E. Wiewiora, and S. Belongie. Objects in context. In Proc. ICCV, 2007.
[19] J. Shotton, J. M. Winn, C. Rother, and A. Criminisi. Textonboost for image understanding: Multi-class object recognition and segmentation by jointly modeling texture, layout, and context. IJCV, 81(1), 2009.
[20] S. W. Smith. The scientist and engineer's guide to digital signal processing. California Technical Publishing, 1997.
[21] A. Torralba, K. P. Murphy, and W. T. Freeman. Sharing visual features for multiclass and multiview object detection. PAMI, 29(5), 2007.
[22] T. Toyoda and O. Hasegawa. Random field model for integration of local information and global information. PAMI, 30, 2008.
[23] J. J. Verbeek and B. Triggs. Scene segmentation with crfs learned from partially labeled images. In Proc. NIPS, 2007.