stanford.edu

Vladlen Koltun
Computer Science Department
Stanford University
vladlen@cs.stanford.edu

Abstract

Most state-of-the-art techniques for multi-class image segmentation and labeling use conditional random fields defined over pixels or image reg
structure that tiles the feature space with simplices arranged along $d+1$ axes [1]. The permutohedral lattice exploits the separability of unit variance Gaussian kernels. Thus we need to apply a whitening transform $\tilde{f} = Uf$ to the feature space in order to use it. The whitening transformation is found using the Cholesky decomposition of $\Lambda^{(m)}$ into $UU^T$. In the transformed space, the high-dimensional convolution can be separated into a sequence of one-dimensional convolutions along the axes of the lattice. The resulting approximate message passing procedure is highly efficient even with a fully sequential implementation that does not make use of parallelism or the streaming capabilities of graphics hardware, which can provide further acceleration if desired.

4 Learning

We learn the parameters of the model by piecewise training. First, the boosted unary classifiers are trained using the JointBoost algorithm [21], using the features described in Section 5. Next we learn the appearance kernel parameters $w^{(1)}$, $\theta_\alpha$, and $\theta_\beta$ for the Potts model. $w^{(1)}$ can be found efficiently by a combination of expectation maximization and high-dimensional filtering. Unfortunately, the kernel widths $\theta_\alpha$ and $\theta_\beta$ cannot be computed effectively with this approach, since their gradient involves a sum of non-Gaussian kernels, which are not amenable to the same acceleration techniques. We found it to be more efficient to use grid search on a holdout validation set for all three kernel parameters $w^{(1)}$, $\theta_\alpha$, and $\theta_\beta$.

The smoothness kernel parameters $w^{(2)}$ and $\theta_\gamma$ do not significantly affect classification accuracy, but yield a small visual improvement. We found $w^{(2)} = \theta_\gamma = 1$ to work well in practice.

The compatibility parameters $\mu(a,b) = \mu(b,a)$ are learned using L-BFGS to maximize the log-likelihood $\ell(\mu : I, T)$ of the model for a validation set of images $I$ with corresponding ground truth labelings $T$. L-BFGS requires the computation of the gradient of $\ell$, which is intractable to estimate exactly, since it requires computing the gradient of the partition function $Z$. Instead, we use the mean field approximation described in Section 3 to estimate the gradient of $Z$. This leads to a simple approximation of the gradient for each training image:

$$\frac{\partial}{\partial \mu(a,b)} \ell(\mu : I^{(n)}, T^{(n)}) \approx -\sum_i T^{(n)}_i(a) \sum_{j \neq i} k(f_i, f_j)\, T^{(n)}_j(b) + \sum_i Q_i(a) \sum_{j \neq i} k(f_i, f_j)\, Q_j(b), \qquad (6)$$

where $(I^{(n)}, T^{(n)})$ is a single training image with its ground truth labeling and $T^{(n)}(a)$ is a binary image in which the $i$th pixel $T^{(n)}_i(a)$ has value 1 if the ground truth label at the $i$th pixel of $T^{(n)}$ is $a$ and 0 otherwise. A detailed derivation of Equation 6 is given in the supplementary material.

The sums $\sum_{j \neq i} k(f_i, f_j)\, T_j(b)$ and $\sum_{j \neq i} k(f_i, f_j)\, Q_j(b)$ are both computationally expensive to evaluate directly. As in Section 3.2, we use high-dimensional filtering to compute both sums efficiently. The runtime of the final learning algorithm is linear in the number of variables $N$.

5 Implementation

The unary potentials used in our implementation are derived from TextonBoost [19, 13]. We use the 17-dimensional filter bank suggested by Shotton et al. [19], and follow Ladický et al. [13] by adding color, histogram of oriented gradients (HOG), and pixel location features. Our evaluation on the MSRC-21 dataset uses this extended version of TextonBoost for the unary potentials. For the VOC 2010 dataset we include the response of bounding box object detectors [4] for each object class as 20 additional features. This increases the performance of the unary classifiers on the VOC 2010 from 13% to 22%. We gain an additional 5% by training a logistic regression classifier on the responses of the boosted classifier.

For efficient high-dimensional filtering, we use a publicly available implementation of the permutohedral lattice [1]. We found a downsampling rate of one standard deviation to work best for all our experiments. Sampling-based filtering algorithms underestimate the edge strength $k(f_i, f_j)$ for very similar feature points. Proper normalization can cancel out most of this error. The permutohedral lattice allows for two types of normalizations. A global normalization by the average kernel strength

| Method | Runtime | Standard ground truth: Global (%) | Standard ground truth: Average (%) | Accurate ground truth: Global (%) | Accurate ground truth: Average (%) |
| Unary classifiers | | 84.0 | 76.6 | 83.2 ± 1.5 | 80.6 ± 2.3 |
| Grid CRF | 1s | 84.6 | 77.2 | 84.8 ± 1.5 | 82.4 ± 1.8 |
| Robust P^n CRF | 30s | 84.9 | 77.5 | 86.5 ± 1.0 | 83.1 ± 1.5 |
| Fully connected CRF | 0.2s | 86.0 | 78.3 | 88.2 ± 0.7 | 84.7 ± 0.7 |
Figure 3: Qualitative and quantitative results on the MSRC-21 dataset.

rameters. The unary potentials were learned on a separate training set that did not include the 94 accurately annotated images.

We also adopt the methodology proposed by Kohli et al. [9] for evaluating segmentation accuracy around boundaries. Specifically, we count the relative number of misclassified pixels within a narrow band ("trimap") surrounding actual object boundaries, obtained from the accurate ground truth images. As shown in Figure 4, our algorithm outperforms previous work across all trimap widths.

PASCAL VOC 2010. Due to the lack of a publicly available ground truth labeling for the test set in the PASCAL VOC 2010, we use the training and validation data for all our experiments. We randomly partitioned the images into 3 groups: 40% training, 15% validation, and 45% test set. Segmentation accuracy was measured using the standard VOC measure [3]. The unary potentials were learned on the training set and yielded an average classification accuracy of 27.6%. The parameters for the Potts potentials in the fully connected CRF model were learned on the validation set. The

(a) Trimaps of different widths (b) Segmentation accuracy within trimap

Figure 4: Segmentation accuracy around object boundaries. (a) Visualization of the trimap measure. (b) Percent of misclassified pixels within trimaps of different widths.
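The trimap measure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the evaluation code used for the reported results: the function names `boundary_pixels` and `trimap_error`, the 4-connected definition of a boundary, the square band of radius `width` pixels, and the toy label maps are all hypothetical choices made for the example.

```python
from typing import List, Set, Tuple

Grid = List[List[int]]  # label map: one integer class label per pixel


def boundary_pixels(gt: Grid) -> Set[Tuple[int, int]]:
    """Pixels that have a right or lower neighbour with a different ground truth label."""
    h, w = len(gt), len(gt[0])
    edge: Set[Tuple[int, int]] = set()
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0)):
                ny, nx = y + dy, x + dx
                if ny < h and nx < w and gt[ny][nx] != gt[y][x]:
                    # Both sides of the label change belong to the boundary.
                    edge.add((y, x))
                    edge.add((ny, nx))
    return edge


def trimap_error(pred: Grid, gt: Grid, width: int) -> float:
    """Fraction of misclassified pixels within `width` pixels of a ground truth boundary."""
    h, w = len(gt), len(gt[0])
    band: Set[Tuple[int, int]] = set()
    for y, x in boundary_pixels(gt):
        # Assumption: the band is a square neighbourhood around each boundary pixel.
        for yy in range(max(0, y - width), min(h, y + width + 1)):
            for xx in range(max(0, x - width), min(w, x + width + 1)):
                band.add((yy, xx))
    if not band:
        return 0.0  # no boundary in this image
    wrong = sum(1 for y, x in band if pred[y][x] != gt[y][x])
    return wrong / len(band)


# Toy example: two classes split by a vertical boundary.
gt = [[0, 0, 0, 1, 1, 1] for _ in range(4)]
pred = [row[:] for row in gt]
pred[0][2] = 1  # an error right at the boundary
pred[0][0] = 1  # an error far from the boundary

errors_by_width = {width: trimap_error(pred, gt, width) for width in (0, 1, 2)}
```

Evaluating `trimap_error` over a range of widths yields one curve per method, which is the kind of comparison plotted in Figure 4(b).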
References

[1] A. Adams, J. Baek, and M. A. Davis. Fast high-dimensional filtering using the permutohedral lattice. Computer Graphics Forum, 29(2), 2010.
[2] A. Adams, N. Gelfand, J. Dolson, and M. Levoy. Gaussian kd-trees for fast high-dimensional filtering. ACM Transactions on Graphics, 28(3), 2009.
[3] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL Visual Object Classes (VOC) challenge. IJCV, 88(2), 2010.
[4] P. F. Felzenszwalb, R. B. Girshick, and D. A. McAllester. Cascade object detection with deformable part models. In Proc. CVPR, 2010.
[5] B. Fulkerson, A. Vedaldi, and S. Soatto. Class segmentation and object localization with superpixel neighborhoods. In Proc. ICCV, 2009.
[6] C. Galleguillos, A. Rabinovich, and S. Belongie. Object categorization using co-occurrence, location and appearance. In Proc. CVPR, 2008.
[7] S. Gould, J. Rodgers, D. Cohen, G. Elidan, and D. Koller. Multi-class segmentation with relative location prior. IJCV, 80(3), 2008.
[8] X. He, R. S. Zemel, and M. A. Carreira-Perpinan. Multiscale conditional random fields for image labeling. In Proc. CVPR, 2004.
[9] P. Kohli, L. Ladický, and P. H. S. Torr. Robust higher order potentials for enforcing label consistency. IJCV, 82(3), 2009.
[10] D. Koller and N. Friedman. Probabilistic Graphical Models: Principles and Techniques. MIT Press, 2009.
[11] V. Kolmogorov and R. Zabih. What energy functions can be minimized via graph cuts? PAMI, 26(2), 2004.
[12] S. Kumar and M. Hebert. A hierarchical field framework for unified context-based classification. In Proc. ICCV, 2005.
[13] L. Ladický, C. Russell, P. Kohli, and P. H. S. Torr. Associative hierarchical crfs for object class image segmentation. In Proc. ICCV, 2009.
[14] L. Ladický, C. Russell, P. Kohli, and P. H. S. Torr. Graph cut based inference with co-occurrence statistics. In Proc. ECCV, 2010.
[15] J. D. Lafferty, A. McCallum, and F. C. N. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proc. ICML, 2001.
[16] S. Paris and F. Durand. A fast approximation of the bilateral filter using a signal processing approach. IJCV, 81(1), 2009.
[17] N. Payet and S. Todorovic. (RF)^2 random forest random field. In Proc. NIPS, 2010.
[18] A. Rabinovich, A. Vedaldi, C. Galleguillos, E. Wiewiora, and S. Belongie. Objects in context. In Proc. ICCV, 2007.
[19] J. Shotton, J. M. Winn, C. Rother, and A. Criminisi. Textonboost for image understanding: Multi-class object recognition and segmentation by jointly modeling texture, layout, and context. IJCV, 81(1), 2009.
[20] S. W. Smith. The scientist and engineer's guide to digital signal processing. California Technical Publishing, 1997.
[21] A. Torralba, K. P. Murphy, and W. T. Freeman. Sharing visual features for multiclass and multiview object detection. PAMI, 29(5), 2007.
[22] T. Toyoda and O. Hasegawa. Random field model for integration of local information and global information. PAMI, 30, 2008.
[23] J. J. Verbeek and B. Triggs. Scene segmentation with crfs learned from partially labeled images. In Proc. NIPS, 2007.