Supervised Descent Method and its Applications to Face Alignment Xuehan Xiong Fe


Abstract: Many computer vision problems (e.g., camera calibration, image alignment, structure from motion) are solved through a nonlinear optimization method. It is generally accepted that 2nd order descent methods are the most robust ...


However, when applying Newton's method to computer vision problems, three main problems arise: (1) The Hessian is positive definite at the local minimum, but it might not be positive definite elsewhere; therefore, the Newton steps might not be taken in the descent direction. (2) Newton's method requires the function to be twice differentiable. This is a strong requirement in many computer vision applications. For instance, consider the case of image alignment using SIFT [21] features, where the SIFT can be seen as a non-differentiable image operator. In these cases, we can estimate the gradient or the Hessian numerically, but this is typically computationally expensive. (3) The dimension of the Hessian matrix can be large; inverting the Hessian requires $O(p^3)$ operations and $O(p^2)$ in space, where $p$ is the dimension of the parameter to estimate. Although explicit inversion of the Hessian is not needed using Quasi-Newton methods such as L-BFGS [9], it can still be computationally expensive to use these methods in computer vision problems.

In order to address the previous limitations, this paper proposes a Supervised Descent Method (SDM) that learns the descent directions in a supervised manner. Fig. 1 illustrates the main idea of our method. The top image shows the application of Newton's method to a Non-linear Least Squares (NLS) problem, where $f(x)$ is a non-linear function of image features (e.g., SIFT) and $y$ is a known vector (i.e., template). $x$ represents the vector of motion parameters (i.e., rotation, scale, non-rigid motion). The traditional Newton update has to compute the Hessian and the Jacobian. Fig. 1b illustrates the main idea behind SDM. The training data consists of a set of functions $\{f(x, y^i)\}$ sampled at different locations $y^i$ (i.e., different people) where the minima $\{x_*^i\}$ are known. Using this training data, SDM learns a series of parameter updates, which incrementally minimize the mean of all NLS functions in training. In the case of NLS, such updates can be decomposed into two parts: a sample specific component (e.g., $y^i$) and generic descent directions $R_k$. SDM learns average descent directions $R_k$ during training. In testing, given an unseen $y$, an update is generated by projecting $y$-specific components onto the learned generic directions $R_k$.

We illustrate the benefits of SDM on analytic functions, and in the problem of facial feature detection and tracking. We show how SDM improves state-of-the-art performance for facial feature detection in two "face in the wild" databases [26, 4] and demonstrate extremely good performance tracking faces in the YouTube celebrity database [20].

2. Previous work

This section reviews previous work on face alignment.

Parameterized Appearance Models (PAMs), such as Active Appearance Models [11, 14, 2], Morphable Models [6, 19], Eigentracking [5], and template tracking [22, 30] build an object appearance and shape representation by computing Principal Component Analysis (PCA) on a set of manually labeled data. Fig. 2a illustrates an image labeled with $p$ landmarks ($p = 66$ in this case). After the images are aligned with Procrustes, the shape model is learned by computing PCA on the registered shapes. A linear combination of $k_s$ shape bases, $U^s \in \Re^{2p \times k_s}$, can reconstruct (approximately) any aligned shape in the training set. Similarly, an appearance model, $U^a \in \Re^{m \times k_a}$, is built by performing PCA on the texture. Alignment is achieved by finding the motion parameter $\mathbf{p}$ and appearance coefficients $c^a$ that best align the image w.r.t. the subspace $U^a$, i.e.,

$$\min_{c^a, \mathbf{p}} \| d(f(x, \mathbf{p})) - U^a c^a \|_2^2, \qquad (2)$$

where $x = [x_1, y_1, \ldots, x_l, y_l]^\top$ is the vector containing the coordinates of the pixels to detect/track. $f(x, \mathbf{p})$ represents a geometric transformation; the value of $f(x, \mathbf{p})$ is a vector denoted by $[u_1, v_1, \ldots, u_l, v_l]^\top$. $d(f(x, \mathbf{p}))$ is the appearance vector of which the $i$-th entry is the intensity of image $d$ at pixel $(u_i, v_i)$. For affine and non-rigid transformations, $(u_i, v_i)$ relates to $(x_i, y_i)$ by

$$\begin{bmatrix} u_i \\ v_i \end{bmatrix} = \begin{bmatrix} a_1 & a_2 \\ a_4 & a_5 \end{bmatrix} \begin{bmatrix} x_i^s \\ y_i^s \end{bmatrix} + \begin{bmatrix} a_3 \\ a_6 \end{bmatrix}.$$

Here $[x_1^s, y_1^s, \ldots, x_l^s, y_l^s]^\top = \bar{x} + U^s c^s$, where $\bar{x}$ is the mean shape face. $a$ and $c^s$ are the affine and non-rigid motion parameters respectively, and $\mathbf{p} = [a, c^s]$.

Given an image $d$, PAMs alignment algorithms optimize Eq. 2. Due to the high dimensionality of the motion space, a standard approach to efficiently search over the parameter space is to use the Gauss-Newton method [5, 2, 11, 14] by doing a Taylor series expansion to approximate

$$d(f(x, \mathbf{p} + \Delta\mathbf{p})) \approx d(f(x, \mathbf{p})) + J_d(\mathbf{p}) \Delta\mathbf{p},$$

where $J_d(\mathbf{p}) = \partial d(f(x, \mathbf{p})) / \partial \mathbf{p}$ is the Jacobian of the image $d$ w.r.t. the motion parameter $\mathbf{p}$ [22].

Discriminative approaches learn a mapping from image features to motion parameters or landmarks. Cootes et al. [11] proposed to fit AAMs by learning a linear regression between the increment of motion parameters $\Delta\mathbf{p}$ and the appearance differences $\Delta d$. The linear regressor is a numerical approximation of the Jacobian [11]. Following this idea, several discriminative methods that learn a mapping from $d$ to $\Delta\mathbf{p}$ have been proposed. Gradient Boosting, first introduced by Friedman [16], has become one of the most popular regressors in face alignment because of its efficiency and the ability to model nonlinearities. Saragih and Göcke [27] and Tresadern et al. [29] showed that using boosted regression for AAM discriminative fitting significantly improved over the original linear formulation. Dollár et al. [15] incorporated "pose indexed features" to the boosting framework, where not only a new weak regressor is learned at each iteration but also the features are re-computed at the latest estimate of the landmark location. Beyond gradient boosting, Rivera and Martinez [24] explored kernel regression to map from image features directly to landmark location, achieving surprising results for low-resolution images. Recently, Cootes et al. [12] investigated Random Forest regressors in the context of face alignment. At the same time, Sánchez et al. [25] proposed to learn a regression model in the continuous domain to efficiently and uniformly sample the motion space. In the context of tracking, Zimmermann et al. [32] learned a set of independent linear predictors for different local motion, and then a subset of them is chosen during tracking.

Figure 2: a) Manually labeled image with 66 landmarks, $x_*$. Blue outline indicates face detector. b) Mean landmarks, $x_0$, initialized using the face detector.

Part-based deformable models perform alignment by maximizing the posterior likelihood of part locations given an image. The objective function is composed of the local likelihood of each part times a global shape prior. Different methods typically vary the optimization methods or the shape prior. Constrained Local Models (CLM) [13] model this prior similarly to AAMs, assuming all faces lie in a linear subspace spanned by PCA bases. Saragih et al. [28] proposed a non-parametric representation to model the posterior likelihood, and the resulting optimization method is reminiscent of mean-shift. In [4], the shape prior was modeled non-parametrically from training data. Recently, Saragih [26] derived a sample specific prior to constrain the output space that significantly improves over the original PCA prior. Instead of using a global model, Huang et al. [18] proposed to build separate Gaussian models for each part (e.g., mouth, eyes) to preserve more detailed local shape deformations. Zhu and Ramanan [31] assumed that the face shape is a tree structure (for fast inference), and used a part-based model for face detection, pose estimation, and facial feature detection.

3. Supervised Descent Method (SDM)

This section describes the SDM in the context of face alignment, and unifies discriminative methods with PAMs.

3.1. Derivation of SDM

Given an image $d \in \Re^{m \times 1}$ of $m$ pixels, $d(x) \in \Re^{p \times 1}$ indexes $p$ landmarks in the image. $h$ is a non-linear feature extraction function (e.g., SIFT) and $h(d(x)) \in \Re^{128p \times 1}$ in the case of extracting SIFT features. During training, we will assume that the correct $p$ landmarks (in our case 66) are known, and we will refer to them as $x_*$ (see Fig. 2a). Also, to reproduce the testing scenario, we ran the face detector on the training images to provide an initial configuration of the landmarks ($x_0$), which corresponds to an average shape (see Fig. 2b). In this setting, face alignment can be framed as minimizing the following function over $\Delta x$:

$$f(x_0 + \Delta x) = \| h(d(x_0 + \Delta x)) - \phi_* \|_2^2, \qquad (3)$$

where $\phi_* = h(d(x_*))$ represents the SIFT values in the manually labeled landmarks. In the training images, $\phi_*$ and $\Delta x$ are known.

Eq. 3 has several fundamental differences with previous work on PAMs in Eq. 2. First, in Eq. 3 we do not learn any model of shape or appearance beforehand from training data. We align the image w.r.t. a template $\phi_*$. For the shape, our model will be a non-parametric one, and we will optimize the landmark locations $x \in \Re^{2p \times 1}$ directly. Recall that in traditional PAMs, the non-rigid motion is modeled as a linear combination of shape bases learned by computing PCA on a training set. Our non-parametric shape model is able to generalize better to untrained situations (e.g., asymmetric facial gestures). Second, we use SIFT features extracted from patches around the landmarks to achieve a robust representation against illumination. Observe that the SIFT operator is not differentiable, and minimizing Eq. 3 using first or second order methods requires numerical approximations (e.g., finite differences) of the Jacobian and the Hessian. However, numerical approximations are very computationally expensive. The goal of SDM is to learn a series of descent directions and re-scaling factors (done by the Hessian in the case of Newton's method) such that it produces a sequence of updates ($x_{k+1} = x_k + \Delta x_k$) starting from $x_0$ that converges to $x_*$ in the training data.

Now, only for derivation purposes, we will assume that $h$ is twice differentiable. Such assumption will be dropped at a later part of the section. Similar to Newton's method, we apply a second order Taylor expansion to Eq. 3 as

$$f(x_0 + \Delta x) \approx f(x_0) + J_f(x_0)^\top \Delta x + \frac{1}{2} \Delta x^\top H(x_0) \Delta x, \qquad (4)$$

where $J_f(x_0)$ and $H(x_0)$ are the Jacobian and Hessian matrices of $f$ evaluated at $x_0$. In the following, we will omit $x_0$ to simplify the notation. Differentiating (4) with respect to $\Delta x$ and setting it to zero gives us the first update for $x$,

$$\Delta x_1 = -H^{-1} J_f = -2 H^{-1} J_h^\top (\phi_0 - \phi_*), \qquad (5)$$

where the second equality follows from the chain rule, $J_f = 2 J_h^\top (\phi_0 - \phi_*)$, with $\phi_0 = h(d(x_0))$.

Function h(x)   Training set y        Training set x = h^{-1}(y)   Test set y
sin(x)          [-1:0.2:1]            arcsin(y)                    [-1:0.05:1]
x^3             [-27:3:27]            y^(1/3)                      [-27:0.5:27]
erf(x)          [-0.99:0.11:0.99]     erf^{-1}(y)                  [-0.99:0.03:0.99]
e^x             [1:3:28]              log(y)                       [1:0.5:28]

Table 1: Experimental setup for the SDM on analytic functions. erf(x) is the error function, $\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2} dt$.

4. Experiments

This section reports experimental results on both synthetic and real data. The first experiment compares the SDM with the Newton method in four analytic functions. In the second experiment, we tested the performance of the SDM in the problem of facial feature detection in two standard databases. Finally, in the third experiment we illustrate how the method can be applied to facial feature tracking.

4.1. SDM on analytic scalar functions

This experiment compares the performance in speed and accuracy of the SDM against Newton's method on four analytic functions. The NLS problem that we optimize is:

$$\min_x f(x) = (h(x) - y)^2,$$

where $h(x)$ is a scalar function (see Table 1) and $y$ is a given constant. Observe that the 1st and 2nd derivatives of those functions can be derived analytically. Assume that we have a fixed initialization $x_0 = c$ and we are given a set of training data $x = \{x_i\}_{i=1}^n$ and $y = \{h(x_i)\}_{i=1}^n$. Unlike the SDM for face alignment, in this case no bias term is learned since $y$ is known at testing time. We trained the SDM as explained in Sec. 3.2.

The training and testing setup for each function are shown in Table 1 in Matlab notation. We have chosen only invertible functions. Otherwise, for a given $y$ multiple solutions may be obtained. In the training data, the output variables $y$ are sampled uniformly in a local region of $h(x)$, and their corresponding inputs $x$ are computed by evaluating $y$ at the inverse function of $h(x)$. The test data $y$ is generated at a finer resolution than in training.

To measure the accuracy of both methods, we computed the normalized least square residuals $\| x_k - x_* \| / \| x_* \|$ at the first 10 steps. Fig. 3 shows the convergence comparison between SDM and the Newton method. Surprisingly, SDM converges with the same number of iterations as the Newton method but each iteration is faster. Moreover, SDM is more robust against bad initializations and ill-conditions ($f'' < 0$). For example, when $h(x) = x^3$ the Newton method starts from a saddle point and stays there in the following iterations (observe that in Fig. 3 the Newton method stays at 1). In the case of $h(x) = e^x$, the Newton method diverges because it is ill-conditioned. Not surprisingly, when the Newton method converges it provides more accurate estimation than SDM, because SDM uses a generic descent direction. If $f$ is quadratic (e.g., $h$ is a linear function of $x$), SDM will converge in one iteration, because the average gradient evaluated at different locations will be the same for linear functions. This coincides with the well-known fact that the Newton method converges in one iteration for quadratic functions.

Figure 3: Normalized error versus iterations on four analytic functions (see Table 1) using the Newton method and SDM.

4.2. Facial feature detection

This section reports experiments on facial feature detection in two "face in the wild" datasets, and compares SDM with state-of-the-art methods. The two face databases are the LFPW dataset^1 [4] and the LFW-A&C dataset [26].

The experimental setup is as follows. First the face is detected using the OpenCV face detector [7]. The evaluation is performed on the images in which a face can be detected. The face detection rates are 96.7% on LFPW and 98.7% on LFW-A&C, respectively. The initial shape estimate is given by centering the mean face at the normalized square. The translational and scaling differences between the initial and true landmark locations are also computed, and their means and variances are used for generating Monte Carlo samples in Eq. 9. We generated 10 perturbed samples for each training image. SIFT descriptors are computed on 32×32 local patches. To reduce the dimensionality of the data, we performed PCA preserving 98% of the energy on the image features.

The LFPW dataset contains images downloaded from the web that exhibit large variations in pose, illumination, and facial expression. Unfortunately, only image URLs are given and some are no longer valid. We downloaded 884

^1 http://www.kbvt.com/LFPW/

Figure 6: Example results from our method on LFPW dataset. The first two rows show faces with strong changes in pose and illumination, and faces partially occluded. The last row shows the 10 worst images measured by normalized mean error.

Figure 7: Example results on LFW-A&C dataset.

Figure 8: Comparison between the tracking results from SDM (top row) and person-specific tracker (bottom row).

Figure 9: Example results on the Youtube Celebrity dataset.
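The scalar experiment of Sec. 4.1 can be illustrated with a small Python sketch. It is not the authors' implementation: Sec. 3.2 is not part of this transcript, so the sketch assumes that each step has the form x_{k+1} = x_k + r_k (h(x_k) − y), with a single scalar r_k per step and no bias term, and that r_k is fitted by least squares on the training pairs. The function names (`rel_err`, `train_sdm`, `apply_sdm`) and the initialization x0 = 0 are hypothetical choices; the sin(x) grids follow the first row of Table 1.

```python
import numpy as np

def rel_err(x, x_star):
    """Normalized residual ||x - x_*|| / ||x_*|| used in Sec. 4.1."""
    return float(np.linalg.norm(x - x_star) / np.linalg.norm(x_star))

def train_sdm(h, x_star, y, x0, n_steps=10):
    """Learn one scalar step r_k per iteration (assumed update form, no bias)."""
    x = np.full_like(x_star, x0, dtype=float)   # same start for every sample
    steps, errors = [], [rel_err(x, x_star)]
    for _ in range(n_steps):
        phi = h(x) - y                           # sample-specific residual
        # Least-squares fit of the remaining offset x_* - x_k onto phi.
        r = float(np.dot(x_star - x, phi) / np.dot(phi, phi))
        x = x + r * phi                          # move all training samples
        steps.append(r)
        errors.append(rel_err(x, x_star))
    return steps, errors

def apply_sdm(h, steps, y, x_star, x0):
    """Replay the learned steps on new targets y; x_star is only for scoring."""
    x = np.full_like(y, x0, dtype=float)
    errors = [rel_err(x, x_star)]
    for r in steps:
        x = x + r * (h(x) - y)
        errors.append(rel_err(x, x_star))
    return x, errors

# h(x) = sin(x): train on y in [-1:0.2:1], test on y in [-1:0.05:1].
y_train = np.linspace(-1.0, 1.0, 11)
y_test = np.linspace(-1.0, 1.0, 41)
steps, train_err = train_sdm(np.sin, np.arcsin(y_train), y_train, x0=0.0)
_, test_err = apply_sdm(np.sin, steps, y_test, np.arcsin(y_test), x0=0.0)

for k, (e_tr, e_te) in enumerate(zip(train_err, test_err)):
    print(f"step {k:2d}  train {e_tr:.4f}  test {e_te:.4f}")
```

Because r_k = 0 is always an admissible fit, the training residual in this sketch cannot increase from one step to the next. No derivative of h is evaluated at test time, which is the property the text contrasts with the Newton update.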
