Learning with compressible priors


observations are amplified [1,4]. Tractable decoding is important for practical reasons, as we have limited time and resources, and it can clearly restrict the class of usable signal priors.

In this paper, we describe compressible prior distributions whose independent and identically distributed (iid) realizations result in compressible signals. A signal is compressible when the sorted magnitudes of its coefficients exhibit a power-law decay. For certain decay rates, compressible signals live close to the sparse signals, i.e., they can be well-approximated by sparse signals. It is well known that the set of $K$-sparse signals has stable and tractable encoder-decoder pairs $(\Phi, \Delta)$ for $M$ as small as $O(K\log(N/K))$ [1,5]. Hence, an $N$-dimensional compressible signal with the proper decay rate inherits the encoder-decoder pairs of its $K$-sparse approximation for a given approximation error, and can be stably embedded into dimensions logarithmic in $N$.

Compressible priors analytically summarize the set of compressible signals and shed new light on underdetermined linear regression problems by building upon the literature on sparse signal recovery. Our main results are summarized as follows:

1) By using order statistics, we show that the compressibility of the iid realizations of the generalized Pareto, Student's t, Fréchet, and log-logistic distributions is independent of the signals' dimension. These distributions are natural members of compressible priors: they truly support logarithmic dimensionality reduction and have important parameter learning guarantees from finite sample sizes. We demonstrate that probabilistic models for the wavelet coefficients of natural images must also be natural members of compressible priors.

2) We point out a common misconception about the generalized Gaussian distribution (GGD): the GGD generates signals that lose their compressibility as $N$ grows. For instance, special cases of the GGD, e.g., the Laplacian distribution, are commonly used as sparsity-promoting priors in CS and SBL problems where $M$ is assumed to grow logarithmically with $N$ [1-3,6]. We show that signals generated from the Laplacian distribution can only be stably embedded into lower dimensions that grow proportionally to $N$. Hence, we identify an inconsistency between the decoding algorithms motivated by the GGD and their sparse solutions.

3) We use compressible priors as a scaffold to build new decoding algorithms based on Bayesian inference arguments. The objective of these algorithms is to approximate the signal realization from a compressible prior, as opposed to pragmatically producing sparse solutions. Some of these new algorithms are variants of the popular iterative re-weighting schemes [3,6-8]. We show how the tuning of these algorithms explicitly depends on the compressible prior parameters, and how to learn the parameters of the signal's compressible prior on the fly while recovering the signal.

The paper is organized as follows. Section 2 provides the necessary background on sparse signal recovery. Section 3 mathematically describes compressible signals and ties them to the order statistics of distributions to introduce compressible priors. Section 4 defines compressible priors, identifies common misconceptions about the GGD, and examines natural images as instances of compressible priors. Section 5 derives new decoding algorithms for underdetermined linear regression problems. Section 6 describes an algorithm for learning the parameters of compressible priors. Section 7 provides simulation results and is followed by our conclusions.

2 Background on Sparse Signals

Any signal $x \in \mathbb{R}^N$ can be represented in terms of $N$ coefficients $\{\alpha_i\}_{i=1}^N$ in a basis $\Psi \in \mathbb{R}^{N \times N}$ via $x = \Psi\alpha$. The signal $x$ has a sparse representation if only $K \ll N$ entries of $\alpha$ are nonzero. To account for sparse signals in an appropriate basis, (1) should be modified as $y = \Phi x + n = \Phi\Psi\alpha + n$. Let $\Sigma_K$ denote the set of all $K$-sparse signals. When $\Phi$ in (1) satisfies the so-called restricted isometry property (RIP), it can be shown that $\Phi$ defines a bi-Lipschitz embedding of $\Sigma_K$ into $\mathbb{R}^M$ [1,4,5]. Moreover, the RIP implies the recovery of $K$-sparse signals to within a given error bound, and the best attainable lower bounds for $M$ are related to the Gelfand width of $\Sigma_K$, which is logarithmic in the signal dimension, i.e., $M = O(K\log(N/K))$ [5]. Without loss of generality, we restrict our attention in the sequel to canonically sparse signals and assume that $\Psi = I$ (the $N \times N$ identity matrix) so that $x = \alpha$. With the sparsity prior and RIP assumptions, inverse maps $\Delta$ can be obtained by solving the following convex problems:

$$\Delta_1(y) = \arg\min_{x'} \|x'\|_1 \ \text{s.t.}\ y = \Phi x'; \quad \Delta_2(y) = \arg\min_{x'} \|x'\|_1 \ \text{s.t.}\ \|y - \Phi x'\|_2 \le \epsilon; \quad \Delta_3(y) = \arg\min_{x'} \|x'\|_1 + \gamma\|y - \Phi x'\|_2^2. \quad (2)$$
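The equality-constrained decoder $\Delta_1$ (basis pursuit [9]) can be posed as a linear program by splitting $x' = u - v$ with $u, v \ge 0$. The following is a minimal Python sketch of this reduction; the function name `l1_decode` and the toy problem sizes are our own, and dedicated solvers such as SPGL1 [24] are far more efficient at realistic sizes.

```python
import numpy as np
from scipy.optimize import linprog

def l1_decode(Phi, y):
    """Basis pursuit: min ||x||_1 subject to Phi @ x = y, posed as an LP."""
    M, N = Phi.shape
    c = np.ones(2 * N)                 # objective: sum(u) + sum(v) = ||x||_1
    A_eq = np.hstack([Phi, -Phi])      # constraint: Phi @ (u - v) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    return res.x[:N] - res.x[N:]

# toy usage: recover a 5-sparse signal from 40 random measurements
rng = np.random.default_rng(0)
N, M, K = 100, 40, 5
x = np.zeros(N)
x[rng.choice(N, size=K, replace=False)] = rng.standard_normal(K)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
x_hat = l1_decode(Phi, Phi @ x)
print(np.linalg.norm(x - x_hat))       # near zero when recovery succeeds
```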
Table 1: Example distributions and the sw$\ell_p(R)$ parameters of their iid realizations

Distribution | pdf | $R$ | $p$
Generalized Pareto | $\frac{q}{2\lambda}\left(1+\frac{|x|}{\lambda}\right)^{-(q+1)}$ | $\lambda N^{1/q}$ | $q$
Student's t | $\frac{\Gamma((q+1)/2)}{\sqrt{2\pi}\,\lambda\,\Gamma(q/2)}\left(1+\frac{x^2}{2\lambda^2}\right)^{-(q+1)/2}$ | $\left[\frac{2\Gamma((q+1)/2)}{\sqrt{\pi}\,q\,\Gamma(q/2)}\right]^{1/q}\lambda N^{1/q}$ | $q$
Fréchet | $\frac{q}{\lambda}\left(\frac{x}{\lambda}\right)^{-(q+1)} e^{-(x/\lambda)^{-q}}$ | $\lambda N^{1/q}$ | $q$
Log-logistic | $\frac{(q/\lambda)(x/\lambda)^{q-1}}{\left[1+(x/\lambda)^q\right]^2}$ | $\lambda N^{1/q}$ | $q$
Generalized Gaussian | $\frac{q}{2\lambda\Gamma(1/q)}\, e^{-(|x|/\lambda)^q}$ | $\lambda\max\{1,\Gamma(1+1/q)\}\log^{1/q}(N/q)$ | $q\log(N/q)$
Weibull | $\frac{q}{\lambda}\left(\frac{x}{\lambda}\right)^{q-1} e^{-(x/\lambda)^q}$ | $\lambda\log^{1/q} N$ | $q\log N$
Gamma | $\frac{1}{\lambda\Gamma(q)}\left(\frac{x}{\lambda}\right)^{q-1} e^{-x/\lambda}$ | $\lambda\max\{1,\Gamma(1+1/q)q\}\log(qN)$ | $\log(qN)$
Log-normal | $\frac{q}{\sqrt{2\pi}\,x}\, e^{-(q\log(x/\lambda))^2/2}$ | $\lambda e^{\sqrt{2\log N}/q}$ | $q\sqrt{2\log N}$

4 Compressible Priors

A compressible prior $f(x;\theta)$ in $\ell_r$ is a pdf with parameters $\theta$ whose MQF satisfies

$$F^{-1}\left(1-\frac{i}{N+1}\right) \lesssim R(N,\theta)\, i^{-1/p(N,\theta)}, \quad \text{where } R > 0 \text{ and } p \le r. \quad (9)$$

Table 1 lists example pdf's, parameterized by $\theta = (q,\lambda) \succ 0$, and the sw$\ell_p(R)$ parameters of their $N$-sample iid realizations. In this paper, we fix $r = 1$ (cf. Section 3); hence, the example pdf's are compressible priors whenever $p \le 1$. In (9), we make it explicit that the sw$\ell_p(R)$ parameters can depend on the parameters $\theta$ of the specific compressible prior as well as the signal dimension $N$. The dependence of the parameter $p$ on $N$ is of particular interest since it has important implications in signal recovery as well as in parameter learning from finite sample sizes, as discussed below.

We define natural $p$-compressible priors as the set $\mathcal{N}_p$ of compressible priors such that $p = p(\theta) \le 1$ is independent of $N$ for all $f(x;\theta) \in \mathcal{N}_p$. It is possible to prove that we can capture most of the $\ell_1$-energy in an $N$-sample iid realization from a natural $p$-compressible prior by using a constant $K$, i.e., $\|x - x_K\|_1 \le \epsilon\|x\|_1$ for any desired $0 < \epsilon \le 1$, by choosing $K = \lceil (p/\epsilon)^{p/(1-p)} \rceil$. Hence, $N$-sample iid signal realizations from the compressible priors in $\mathcal{N}_p$ can be truly embedded into dimensions $M$ that grow logarithmically with $N$, with tractable decoding guarantees due to (3). $\mathcal{N}_p$ members include the generalized Pareto (GPD), Fréchet (FD), and log-logistic (LLD) distributions.
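To give a feel for the constant (the numbers here are ours, not the paper's): a natural $p$-compressible prior with $p = 1/2$ and a target relative error $\epsilon = 0.1$ needs only

$$K = \left\lceil \left(\frac{p}{\epsilon}\right)^{\frac{p}{1-p}} \right\rceil = \left\lceil \left(\frac{0.5}{0.1}\right)^{1} \right\rceil = 5$$

coefficients to capture 90% of the $\ell_1$-energy, whether $N$ is a thousand or a billion; raising $p$ to $2/3$ makes the exponent $2$ and gives $K = \lceil (20/3)^2 \rceil = 45$.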
It may then come as a surprise that the generalized Gaussian distribution (GGD) is not a natural $p$-compressible prior, since its iid realizations lose their compressibility as $N$ grows (cf. Table 1). While it is common practice to use a GGD prior with $q \le 1$ for sparse signal recovery, we have no recovery guarantees for signals generated from the GGD when $M$ grows logarithmically with $N$ in (1).¹ In fact, to be $p$-compressible, the shape parameter of a GGD prior should satisfy $q = N e^{W_{-1}(-p/N)}$, where $W_{-1}(\cdot)$ is the Lambert W-function with the alternate branch. As a result, GGD parameters learned from dimensionality-reduced data will in general depend on the dimension and may not generalize to other dimensions. Along with the GGD, Table 1 shows how the Weibull, gamma, and log-normal distributions are dimension-restricted in their membership in the set of compressible priors.

Wavelet coefficients of natural images provide a stylized example to demonstrate why we should care about the dimensional independence of the parameter $p$.² As a brief background, we first note that research in natural image modeling to date has had two distinct approaches, one focusing on deterministic explanations and the other pursuing probabilistic models [12]. Deterministic approaches operate under the assumption that natural images belong to Besov spaces, having a bounded number of derivatives between edges. Unsurprisingly, wavelet thresholding is provably near-optimal for representing and denoising Besov-space images. As the simplest example, the magnitude-sorted discrete wavelet coefficients $w_{(i)}$ of a Besov $q$-image should satisfy $w_{(i)} = R\, i^{-1/q}$. The probabilistic approaches, on the other hand, exploit the power-law decay of the power spectra of images and fit various pdf's, such as the GGD and Gaussian scale mixtures, to the histograms of wavelet coefficients.

¹To illustrate the issues with the compressibility of the GGD, consider the Laplacian distribution (LD; the GGD with $q = 1$), which is the conventional convex prior for promoting sparsity. Via order statistics, it is possible to show that $x_{(i)} \approx \lambda\log(N/i)$ for $x_i \sim \mathrm{GGD}(1,\lambda)$. Without loss of generality, let us judiciously pick $\lambda = 1/\log N$ so that $R = 1$. Then, we have $\|x\|_1 \approx N - 1$ and $\|x - x_K\|_1 \approx N - K\log(N/K) - K$. When we only have $K$ terms to capture $(1-\epsilon)$ of the $\ell_1$-energy in the signal $x$, we need $K \gtrsim (1-\sqrt{\epsilon})N$.
²Here, we assume that the reader is familiar with the discrete wavelet transform and its properties [11].

[...] $f(x\,|\,a) = \mathrm{GGD}(x; s, a^{-1})$. Marginalizing $a$ from $f(x\,|\,a)$, we reach the SMD as the true underlying distribution of $x$. SMDs arise in multiple contexts, such as the SBL framework that exploits Student's t (i.e., $s = 2$) for learning problems [2], and the Laplacian and Gaussian scale mixtures (i.e., $s = 1$ and $2$, respectively) that model natural images [17,18].

Due to lack of space, we only focus on noiseless observations in (1). We assume that $x$ is an $N$-sample iid realization from $\mathrm{SMD}(x; q,\lambda,s)$ with known parameters $(q,\lambda,s) \succ 0$ and choose a solution $\hat{x}$ that maximizes the SMD likelihood to find the true vector $x$ among the kernel of $\Phi$:

$$\hat{x} = \arg\max_{x'} \mathrm{SMD}(x'; q,\lambda,s) = \arg\min_{x'} \sum_i \log\left(1 + \lambda^{-s}|x'_i|^s\right), \ \text{s.t.}\ y = \Phi x'. \quad (12)$$

The majorization-minimization trick in Section 5.1 also circumvents the non-convexity in (12):

$$\hat{x}_{\{k\}} = \arg\min_{x'} \sum_i w_{i,\{k\}} |x'_i|^s, \ \text{s.t.}\ y = \Phi x', \quad \text{where } w_{i,\{k\}} = \left(\lambda^s + |x_{i,\{k\}}|^s\right)^{-1}. \quad (13)$$

The decoding scheme in (13) is well known as the iterative re-weighted $\ell_s$ algorithm [7,19-21].
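With $s = 1$, each pass of (13) is a weighted basis-pursuit LP, which reduces to a linear program exactly as $\Delta_1$ does. Below is a minimal sketch under that assumption (function names ours); since $\hat{x}$ starts at zero, the first pass uses constant weights and coincides with plain $\ell_1$ decoding, as in the re-weighted scheme of [7]. Here $\lambda$ is the scale parameter of the prior, taken as known.

```python
import numpy as np
from scipy.optimize import linprog

def weighted_l1_decode(Phi, y, w):
    """min sum_i w_i |x_i| subject to Phi @ x = y, via x = u - v with u, v >= 0."""
    N = Phi.shape[1]
    c = np.concatenate([w, w])          # objective: sum_i w_i (u_i + v_i)
    A_eq = np.hstack([Phi, -Phi])       # constraint: Phi @ (u - v) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    return res.x[:N] - res.x[N:]

def reweighted_l1_decode(Phi, y, lam, iters=5):
    """Iterative re-weighted l1: equation (13) with s = 1."""
    x = np.zeros(Phi.shape[1])
    for _ in range(iters):
        w = 1.0 / (lam + np.abs(x))     # w_i = (lam^s + |x_i|^s)^(-1), s = 1
        x = weighted_l1_decode(Phi, y, w)
    return x
```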
6 Parameter Learning for Compressible Distributions

While deriving the decoding algorithms in Section 5, we assumed that the signal coefficients $x_i$ are generated from a compressible prior $f(x;\theta)$ and that $\theta$ is known. We now relax the latter assumption and discuss how to simultaneously estimate $x$ and learn the parameters $\theta$.

When we visualize the joint estimation of $x$ and $\theta$ from $y$ in (1) as a graphical model, we immediately realize that $x$ creates a Markov blanket for $\theta$. Hence, to determine $\theta$, we have to estimate the signal coefficients. When $\Phi$ has the stable embedding property, we know that the decoding algorithms can obtain $x$ with approximation guarantees, such as (3). Then, given $x$, we can choose an estimator for $\theta$ via standard Bayesian inference arguments. Unfortunately, this argument runs into one important roadblock: estimating the signal $x$ without knowing the prior parameters $\theta$.

A naive approach to overcoming this roadblock is to split the optimization space and alternate on $x$ and $\theta$ while optimizing the Bayesian objective. Unfortunately, there is one important and unrecognized bug in this argument: the estimated signal values are in general not iid, hence we would be minimizing the wrong Bayesian objective to determine $\theta$. To see this, we first note that the recovered signals $\hat{x}$ in general consist of $M \ll N$ non-zero coefficients that mimic the best $K$-term approximation $x_K$ of the signal, plus some other coefficients that explain the small tail energy. We then recall from Section 3 that the coefficients of $x_K$ are statistically dependent. Hence, at least partially, the significant coefficients of $\hat{x}$ are also dependent.

One way to overcome this dependency issue is to treat the recovered signals as if they were drawn iid from a censored GPD. However, the optimization becomes complicated and the approach does not provide any additional guarantees. As an alternative, we propose to exploit geometry and use the consensus among the coefficients in fitting the sw$\ell_p(R)$ parameters via the auxiliary signal estimates $\hat{x}_{\{k\}}$ during iterative recovery. To do this, we employ Fischler and Bolles' probabilistic random sample consensus (RANSAC) algorithm [22] to fit a line whose $y$-intercept is $\log R(N,\theta)$ and whose slope is $-1/p(N,\theta)$:

$$\log\left|\hat{x}_{i,\{k\}}\right| = \log R(N,\theta) - \frac{1}{p(N,\theta)}\log i, \quad \text{for } i = 1,\dots,K, \ \text{where } K = M/[C\log(N/M)], \quad (14)$$

where $C \approx 4$-$5$ as discussed in Section 3. RANSAC provides excellent results with high probability even if the data contain significant outliers. Because of its probabilistic nature, it is computationally efficient. The RANSAC algorithm requires a threshold to gate the observations and count how well a proposed solution is supported by them [22]. We determine this threshold by bounding the tail probability that the OS of a compressible prior will be out of bounds. For the pseudo-code and further details of the RANSAC algorithm, cf. [22].
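A minimal sketch of the RANSAC line fit for (14), with our own function name: regress $\log|\hat{x}_{(i)}|$ on $\log i$ over the $K$ largest magnitudes to read off $\log R$ (intercept) and $-1/p$ (slope). The paper derives the gating threshold from an order-statistics tail bound; here it is simply an input `tau`.

```python
import numpy as np

def fit_swlp_ransac(x_hat, K, tau, trials=500, rng=None):
    """Fit log|x_(i)| = log R - (1/p) log i over the K largest magnitudes."""
    rng = np.random.default_rng(rng)
    mags = np.sort(np.abs(x_hat))[::-1][:K]    # assumes the top-K entries are nonzero
    t = np.log(np.arange(1, K + 1))
    z = np.log(mags)
    best = None
    for _ in range(trials):
        i, j = rng.choice(K, size=2, replace=False)  # 2-point line hypothesis
        slope = (z[j] - z[i]) / (t[j] - t[i])
        inliers = np.abs(z - (z[i] + slope * (t - t[i]))) < tau
        if best is None or inliers.sum() > best.sum():
            best = inliers
    slope, intercept = np.polyfit(t[best], z[best], 1)  # refit on the consensus set
    return np.exp(intercept), -1.0 / slope              # (R, p)
```

With the settings of Section 7.3, one would call this with `tau = np.log(2)` and `trials = 500`, and $K = M/[C\log(N/M)]$ as in (14).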
7 Experiments

7.1 Order Statistics

To demonstrate the sw$\ell_p(R)$ decay profile of $p$-compressible priors, we generated iid realizations of the GGD with $q = 1$ (LD) and the GPD with $q = 1$, and (non-iid) realizations of the MLD with $q = 1$, for varying signal dimensions $N = 10^j$, where $j = 2,3,4,5$. We sorted the magnitudes of the signal coefficients and normalized them by their corresponding value of $R$. We then plotted the results on a log-log scale in Fig. 1. At http://dsp.rice.edu/randcs, we provide a MATLAB routine (randcs.m) so that it is easy to repeat the same experiment for the rest of the distributions in Table 1.

[Figure 3: Improvements afforded by re-weighted $\ell_1$-decoding (a) with known parameters and (b) with learning. (c) The learned sw$\ell_p$ exponent of the GPD with $q = 0.4$ via the RANSAC algorithm.]

[...] coefficients of natural images that have almost no approximation power for the overall image. Moreover, the learned GGD is dimension dependent, assigns lower probability to the large coefficients that explain the image well, and predicts a mismatched OS for natural images (cf. Fig. 2(b)). Figure 2(c) compares the magnitude-ordered pixel gradients of the images (solid lines) with the expected OS of the GGD (dashed line). From the figure, it appears that natural image pixel gradients lose their compressibility as the image dimensions grow, similar to the GGD, Weibull, gamma, and log-normal distributions. In the figure, the GGD parameters are $(q,\lambda) = (0.95, 25)$.

7.3 Iterative $\ell_1$ Decoding

We repeat the compressible signal recovery experiment in Section 3.2 of [7] to demonstrate the performance of our iterative $\ell_s$ decoder with $s = 1$ in (13). We first randomly sample a signal $x \in \mathbb{R}^N$ ($N = 256$) whose coefficients are iid from the GPD with $q = 0.4$ and $\lambda = (N+1)^{-1/q}$, so that $\mathbb{E}[x_{(1)}] \approx 1$. We set $M = 128$ and draw a random $M \times N$ matrix $\Phi$ with iid Gaussian entries to obtain $y = \Phi x$. We then decode signals via (13) with the maximum number of iterations set to 5, both with knowledge of the signal parameters and with learning. During the learning phase, we use $\log(2)$ as the threshold for the RANSAC algorithm. We set the maximum iteration count of RANSAC to 500. The results of a Monte Carlo run with 100 independent realizations are illustrated in Fig. 3. In Figs. 3(a) and (b), the plots summarize the average improvement over the standard decoder $\Delta_1(y)$ via the histograms of $\|x - \hat{x}_{\{4\}}\|_2 / \|x - \Delta_1(y)\|_2$, which have mean and standard deviation $(0.7062, 0.1380)$ when we know the parameters of the GPD (a), and $(0.7101, 0.1364)$ when we learn the parameters of the GPD via RANSAC (b). The learned sw$\ell_p$ exponent is summarized by the histogram in Fig. 3(c), which has mean and standard deviation $(0.3757, 0.0539)$. Hence, we conclude that our alternative learning approach via the RANSAC algorithm is competitive with knowing the actual prior parameters that generated the signal. Moreover, the computational time of learning is insignificant compared to the time required by the state-of-the-art SPGL1 algorithm [24].
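The signal model above is easy to reproduce without the MATLAB routine. Here is a rough Python analogue (our own code, under the stated parameters): draw iid GPD coefficients by inverting the CDF of $|x|$, pick $\lambda = (N+1)^{-1/q}$ as in Section 7.3 so that $\mathbb{E}[x_{(1)}] \approx 1$, and check the log-log decay of the sorted magnitudes against the predicted slope $-1/q$ from Section 7.1.

```python
import numpy as np

def gpd_signal(N, q, lam, rng):
    # |x| has CDF 1 - (1 + t/lam)^(-q); invert it and attach random signs
    u = rng.uniform(size=N)
    return rng.choice([-1.0, 1.0], size=N) * lam * ((1 - u) ** (-1 / q) - 1)

rng = np.random.default_rng(0)
N, q = 256, 0.4
lam = (N + 1) ** (-1 / q)
x = np.sort(np.abs(gpd_signal(N, q, lam, rng)))[::-1]
print(x[0])        # on the order of 1 on average; heavy-tailed, so draws vary widely
k = N // 10        # fit the top decade, away from the flattening tail
slope = np.polyfit(np.log(np.arange(1, k + 1)), np.log(x[:k]), 1)[0]
print(slope)       # roughly -1/q = -2.5, independent of N
```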
8 Conclusions³

Compressible priors create a connection between probabilistic and deterministic models for signal compressibility. The bridge between these two seemingly different modeling frameworks turns out to be the concept of order statistics. We demonstrated that when the $p$-parameter of a compressible prior is independent of the ambient dimension $N$, it is possible to have a truly logarithmic embedding of its iid signal realizations. Moreover, the learned parameters of such compressible priors are dimension agnostic. In contrast, we showed that when the $p$-parameter depends on $N$, we face many restrictions in signal embedding and recovery as well as in parameter learning. We illustrated that wavelet coefficients of natural images can be well approximated by the generalized Pareto prior, which in turn predicts a disappointing approximation rate for image coding with the naive sparse model and for CS image recovery from measurements that grow only logarithmically with the image dimension. We motivated many of the existing sparse signal recovery algorithms as instances of a corresponding compressible prior and discussed parameter learning for these priors from dimensionality-reduced data. We hope that the iid compressibility view taken in this paper will pave the way for a better understanding of probabilistic non-iid and structured compressibility models.

³We thank R. G. Baraniuk, M. Wakin, M. Davies, J. Haupt, and J. P. Slavinsky for useful discussions. Supported by ONR N00014-08-1-1112, DARPA N66001-08-1-2065, and ARO W911NF-09-1-0383 grants.

References

[1] E. J. Candès. Compressive sampling. In Proc. International Congress of Mathematicians, volume 3, pages 1433-1452, Madrid, Spain, 2006.
[2] M. E. Tipping. Sparse Bayesian learning and the relevance vector machine. The Journal of Machine Learning Research, 1:211-244, 2001.
[3] D. P. Wipf and B. D. Rao. Sparse Bayesian learning for basis selection. IEEE Transactions on Signal Processing, 52(8):2153-2164, 2004.
[4] T. Blumensath and M. E. Davies. Sampling theorems for signals from the union of linear subspaces. IEEE Trans. Info. Theory, 2009.
[5] A. Cohen, W. Dahmen, and R. DeVore. Compressed sensing and best k-term approximation. Journal of the American Mathematical Society, 22(1):211-231, 2009.
[6] I. F. Gorodnitsky, J. S. George, and B. D. Rao. Neuromagnetic source imaging with FOCUSS: a recursive weighted minimum norm algorithm. Electroenceph. and Clin. Neurophys., 95(4):231-251, 1995.
[7] E. J. Candès, M. B. Wakin, and S. P. Boyd. Enhancing sparsity by reweighted $\ell_1$ minimization. Journal of Fourier Analysis and Applications, 14(5):877-905, 2008.
[8] D. P. Wipf and S. Nagarajan. Iterative reweighted $\ell_1$ and $\ell_2$ methods for finding sparse solutions. In SPARS'09, Rennes, France, 2009.
[9] S. S. Chen, D. L. Donoho, and M. A. Saunders. Atomic decomposition by basis pursuit. SIAM Review, pages 129-159, 2001.
[10] H. A. David and H. N. Nagaraja. Order Statistics. Wiley-Interscience, 2004.
[11] S. Mallat. A Wavelet Tour of Signal Processing. Academic Press, 1999.
[12] H. Choi and R. G. Baraniuk. Wavelet statistical models and Besov spaces. Lecture Notes in Statistics, pages 9-30, 2003.
[13] T. K. Nayak. Multivariate Lomax distribution: properties and usefulness in reliability theory. Journal of Applied Probability, pages 170-177, 1987.
[14] V. Cevher. Compressible priors. IEEE Trans. on Information Theory, in preparation, 2010.
[15] R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, pages 267-288, 1996.
[16] E. T. Hale, W. Yin, and Y. Zhang. Fixed-point continuation for $\ell_1$-minimization: Methodology and convergence. SIAM Journal on Optimization, 19:1107, 2008.
[17] P. J. Garrigues. Sparse Coding Models of Natural Images: Algorithms for Efficient Inference and Learning of Higher-Order Structure. PhD thesis, EECS Department, University of California, Berkeley, May 2009.
[18] M. J. Wainwright and E. P. Simoncelli. Scale mixtures of Gaussians and the statistics of natural images. In NIPS, 2000.
[19] D. Wipf and S. Nagarajan. A new view of automatic relevance determination. In NIPS, volume 20, 2008.
[20] I. Daubechies, R. DeVore, M. Fornasier, and S. Gunturk. Iteratively re-weighted least squares minimization for sparse recovery. Commun. Pure Appl. Math., 2009.
[21] R. Chartrand and W. Yin. Iteratively reweighted algorithms for compressive sensing. In ICASSP, pages 3869-3872, 2008.
[22] M. A. Fischler and R. C. Bolles. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6):381-395, 1981.
[23] E. J. Candès and D. L. Donoho. Curvelets and curvilinear integrals. Journal of Approximation Theory, 113(1):59-90, 2001.
[24] E. van den Berg and M. P. Friedlander. Probing the Pareto frontier for basis pursuit solutions. SIAM Journal on Scientific Computing, 31(2):890-912, 2008.