Large deviations, weak convergence, and relative entropy
Markus Fischer, University of Padua


Essential tools for large deviations analysis: weak convergence of probability measures (Section 3) and relative entropy (Section 4). Weak convergence is especially useful in the Dupuis and Ellis [1997] approach; see lectures.

Table 1: Notation


$C^k_c(\mathbb{R}^d)$    space of all continuous functions $\mathbb{R}^d \to \mathbb{R}$ with compact support and continuous partial derivatives up to order $k$
$\mathbf{T}$    a subset of $\mathbb{R}$, usually $[0,T]$ or $[0,\infty)$
$C(\mathbf{T}:\mathcal{X})$    space of all continuous functions $\mathbf{T} \to \mathcal{X}$
$D(\mathbf{T}:\mathcal{X})$    space of all càdlàg functions $\mathbf{T} \to \mathcal{X}$ (i.e., functions continuous from the right with limits from the left)
$\wedge$    minimum (as binary operator)
$\vee$    maximum (as binary operator)

2 Large deviations

A standard textbook on the theory of large deviations is Dembo and Zeitouni [1998]; also see Ellis [1985], Deuschel and Stroock [1989], Dupuis and Ellis [1997], den Hollander [2000], and the references therein. A foundational work in the theory is Varadhan [1966].

2.1 Coin tossing

Consider the following random experiments: Given a number $n \in \mathbb{N}$, toss $n$ coins of the same type and count the number of coins that land heads up. Denote that (random) number by $S_n$. Then $S_n/n$ is the empirical mean, here equal to the empirical probability, of getting heads. What can be said about $S_n/n$ for $n$ large?

To construct a mathematical model for the coin tossing experiments, let $X_1, X_2, \ldots$ be $\{0,1\}$-valued independent and identically distributed (i.i.d.) random variables defined on some probability space $(\Omega, \mathcal{F}, \mathbf{P})$. Thus each $X_i$ has Bernoulli distribution with parameter $p := \mathbf{P}(X_1 = 1)$. Interpret $X_i(\omega) = 1$ as saying that coin $i$ at realization $\omega \in \Omega$ lands heads up. Then $S_n = \sum_{i=1}^n X_i$. By the strong/weak law of large numbers (LLN),

$$\frac{S_n}{n} \xrightarrow{n\to\infty} p \quad \text{with probability one / in probability.}$$

In particular, by the weak law of large numbers, for all $\varepsilon > 0$,

$$\mathbf{P}\{|S_n/n - p| \ge \varepsilon\} \xrightarrow{n\to\infty} 0.$$

More can be said about the asymptotic behavior of those deviation probabilities. Observe that $S_n$ has binomial distribution with parameters $(n,p)$, [...]
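The decay of the deviation probabilities in the coin tossing model can be examined numerically. The following Python sketch computes $\mathbf{P}\{S_n/n \ge x\}$ exactly from the binomial distribution and compares $-\frac{1}{n}\log \mathbf{P}\{S_n/n \ge x\}$ with the Bernoulli rate function $I(x) = x\log\frac{x}{p} + (1-x)\log\frac{1-x}{1-p}$ stated in Section 2.3. The function names (`log_binom_tail`, `bernoulli_rate`, `empirical_rates`) are illustrative assumptions and do not come from the lecture notes.

```python
import math


def log_binom_tail(n: int, p: float, x: float) -> float:
    """Return log P{S_n / n >= x} for S_n ~ Binomial(n, p), computed exactly."""
    # Smallest integer k with k >= n * x (small tolerance against rounding).
    k_min = max(0, math.ceil(n * x - 1e-12))
    # Logarithm of each binomial term; lgamma avoids overflow of factorials.
    logs = [
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
        for k in range(k_min, n + 1)
    ]
    # Log-sum-exp for numerical stability.
    m = max(logs)
    return m + math.log(sum(math.exp(v - m) for v in logs))


def bernoulli_rate(x: float, p: float) -> float:
    """Rate function I(x) for Bernoulli(p): finite on [0, 1], +inf elsewhere."""
    if x < 0.0 or x > 1.0:
        return math.inf

    def xlog(a: float, b: float) -> float:
        # Convention 0 * log(0 / b) = 0.
        return 0.0 if a == 0.0 else a * math.log(a / b)

    return xlog(x, p) + xlog(1.0 - x, 1.0 - p)


def empirical_rates(p: float, x: float, ns: list[int]) -> list[float]:
    """Return -(1/n) log P{S_n / n >= x} for each n in ns."""
    return [-log_binom_tail(n, p, x) / n for n in ns]


if __name__ == "__main__":
    p, x = 0.5, 0.7
    print("I(x) =", bernoulli_rate(x, p))
    for n, r in zip([50, 200, 1000, 4000], empirical_rates(p, x, [50, 200, 1000, 4000])):
        print(n, r)
```

For $x > p$ the printed values lie above $I(x)$ (this is the Chernoff bound $\mathbf{P}\{S_n/n \ge x\} \le e^{-nI(x)}$) and approach it as $n$ grows, which is the behavior the large deviation principle describes on the exponential scale.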
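2.2 The large deviation principle

The theory of large deviations will be developed in this course for random variables taking values in a Polish space. A Polish space is a separable topological space whose topology is compatible with a complete metric. Examples of Polish spaces are

$\mathbb{R}^d$ with the standard topology,

any closed subset of $\mathbb{R}^d$ (or another Polish space) equipped with the induced topology,

the space $C(\mathbf{T}:\mathcal{X})$ of continuous functions, $\mathbf{T} \subseteq (-\infty,\infty)$ an interval, $\mathcal{X}$ a complete and separable metric space, equipped with the topology of uniform convergence on compact subsets of $\mathbf{T}$,

the space $D(\mathbf{T}:\mathcal{X})$ of càdlàg functions, $\mathbf{T} \subseteq (-\infty,\infty)$ an interval, $\mathcal{X}$ a complete and separable metric space, equipped with the Skorohod topology [e.g. Billingsley, 1999, Chapter 3],

the space $\mathcal{P}(\mathcal{X})$ of probability measures on $\mathcal{B}(\mathcal{X})$, $\mathcal{X}$ a Polish space, equipped with the weak convergence topology (cf. Section 3).

Let $(\xi^n)_{n\in\mathbb{N}}$ be a family of random variables with values in a Polish space $S$. Let $I: S \to [0,\infty]$ be a function with compact sublevel sets, i.e., $\{x \in S : I(x) \le c\}$ is compact for every $c \in [0,\infty)$. Such a function is lower semicontinuous and is called a (good) rate function.

Definition 2.1. The sequence $(\xi^n)_{n\in\mathbb{N}}$ satisfies the large deviation principle with rate function $I$ if for all $G \in \mathcal{B}(S)$,

$$-\inf_{x \in G^\circ} I(x) \le \liminf_{n\to\infty} \frac{1}{n}\log \mathbf{P}\{\xi^n \in G\} \le \limsup_{n\to\infty} \frac{1}{n}\log \mathbf{P}\{\xi^n \in G\} \le -\inf_{x \in \mathrm{cl}(G)} I(x),$$

where $G^\circ$ denotes the interior and $\mathrm{cl}(G)$ the closure of $G$.

The large deviation principle is a distributional property: Writing $P_n$ for $\mathrm{Law}(\xi^n)$, $(\xi^n)$ satisfies the large deviation principle with rate function $I$ if and only if for all $G \in \mathcal{B}(S)$,

$$-\inf_{x \in G^\circ} I(x) \le \liminf_{n\to\infty} \frac{1}{n}\log P_n(G) \le \limsup_{n\to\infty} \frac{1}{n}\log P_n(G) \le -\inf_{x \in \mathrm{cl}(G)} I(x).$$

[...]

1. The (good) rate function of a large deviation principle is uniquely determined.

2. If $I$ is the rate function of a large deviation principle, then $\inf_{x\in S} I(x) = 0$ and $I(x_*) = 0$ for some $x_* \in S$. If $I$ has a unique minimizer, then the large deviation principle implies a corresponding law of large numbers.

3. The large deviation principle holds if and only if the Laplace principle holds, and the (good) rate function is the same.

4. Contraction principle: Let $\mathcal{Y}$ be a Polish space and $\psi: S \to \mathcal{Y}$ be a measurable function. If $(\xi^n)$ satisfies the large deviation principle with (good) rate function $I$ and if $\psi$ is continuous on $\{x \in S : I(x) < \infty\}$, then $(\psi(\xi^n))$ satisfies the large deviation principle with (good) rate function $J(y) := \inf_{x \in \psi^{-1}(y)} I(x)$.

2.3 Large deviations for empirical means

Let $X_1, X_2, \ldots$ be $\mathbb{R}$-valued i.i.d. random variables with common distribution $\mu$ such that $m := \int x\,\mu(dx) = \mathbf{E}[X_1]$ is finite. As in the case of coin flipping, set $S_n := \sum_{i=1}^n X_i$ and consider the asymptotic behavior of $S_n/n$, $n \in \mathbb{N}$. By the law of large numbers, $S_n/n \to m$ as $n \to \infty$ with probability one. Let $\phi_\mu$ be the moment generating function of $\mu$ (or $X_1, X_2, \ldots$), that is,

$$\phi_\mu(t) := \int_{\mathbb{R}} e^{tx}\,\mu(dx) = \mathbf{E}\left[e^{tX_1}\right], \quad t \in \mathbb{R}.$$

Theorem 2.1 (Cramér). Suppose that $\mu$ is such that $\phi_\mu(t)$ is finite for all $t \in \mathbb{R}$. Then $(S_n/n)_{n\in\mathbb{N}}$ satisfies the large deviation principle with rate function $I$ given by

$$I(x) := \sup_{t\in\mathbb{R}} \{t x - \log(\phi_\mu(t))\}.$$

The rate function $I$ in Theorem 2.1 is the Legendre transform of $\log\phi_\mu$, the logarithmic moment generating function or cumulant generating function of the common distribution $\mu$. Two particular cases:

1. Bernoulli distribution: $\mu$ the Bernoulli distribution on $\{0,1\}$ with parameter $p$. Then $\phi_\mu(t) = 1 - p + p e^t$ and

$$I(x) = x \log\frac{x}{p} + (1-x)\log\frac{1-x}{1-p}, \quad x \in [0,1],$$

$I(x) = \infty$ for $x \in \mathbb{R} \setminus [0,1]$, as in Example 1.

[...]

which yields the large deviation upper bound for closed sets of the form $[x,\infty)$, $x > 0$. A completely analogous argument gives the upper bound for closed sets of the form $(-\infty,x]$, $x < 0$. Let $G \subset \mathbb{R}$ be a closed set not containing zero. Then, thanks to the strict convexity of $I$ and the fact that $I \ge 0$ and $I(0) = 0$, there are $x_+ > 0$, $x_- < 0$ such that $G \subseteq (-\infty,x_-] \cup [x_+,\infty)$ and $\inf_{x\in G} I(x) = \inf_{x \in (-\infty,x_-]\cup[x_+,\infty)} I(x)$. This establishes the large deviation upper bound.

To obtain the large deviation lower bound, we first consider open sets of the form $(x-\delta, x+\delta)$ for $x \in (\operatorname{ess\,inf} X_1, \operatorname{ess\,sup} X_1)$, $\delta > 0$. Fix such $x$, $\delta$. Since $\phi_\mu$ is everywhere finite by hypothesis, and thus $\log\phi_\mu$ is continuously differentiable and strictly convex on $\mathbb{R}$, there exists a unique solution $t_x \in \mathbb{R}$ to the equation

$$x = (\log\phi_\mu)'(t_x) = \frac{\phi_\mu'(t_x)}{\phi_\mu(t_x)} = \frac{\mathbf{E}\left[X_1 e^{t_x X_1}\right]}{\mathbf{E}\left[e^{t_x X_1}\right]},$$

and $I(x) = t_x x - \log\phi_\mu(t_x)$. Define a probability measure $\tilde\mu \in \mathcal{P}(\mathbb{R})$ absolutely continuous with respect to $\mu$ according to

$$\frac{d\tilde\mu}{d\mu}(y) := \exp\left(t_x y - \log\phi_\mu(t_x)\right) = \frac{e^{t_x y}}{\phi_\mu(t_x)}.$$

Let $Y_1, Y_2, \ldots$ be i.i.d. random variables with common distribution $\tilde\mu$, and set $\tilde S_n := \sum_{i=1}^n Y_i$, $n \in \mathbb{N}$. Then $\mathbf{E}[Y_i] = x$ and, for $\varepsilon \in (0,\delta]$, writing $A_n := \{y \in \mathbb{R}^n : \sum_{i=1}^n y_i \in (n(x-\varepsilon), n(x+\varepsilon))\}$,

$$\mathbf{P}\{S_n/n \in (x-\varepsilon,x+\varepsilon)\} = \int_{A_n} \mu^{\otimes n}(dy) \ge e^{-n\,(t_x(x-\varepsilon)\,\vee\, t_x(x+\varepsilon))} \int_{A_n} e^{t_x \sum_{i=1}^n y_i}\,\mu^{\otimes n}(dy) = e^{n\log\phi_\mu(t_x)}\, e^{-n\,(t_x(x-\varepsilon)\,\vee\, t_x(x+\varepsilon))}\, \mathbf{P}\left\{\tilde S_n/n \in (x-\varepsilon,x+\varepsilon)\right\}.$$

Since $Y_1, Y_2, \ldots$ are i.i.d. with $\mathbf{E}[Y_i] = x$, we have $\tilde S_n/n \to x$ in probability as $n \to \infty$ by the weak law of large numbers. It follows that

$$\liminf_{n\to\infty} \frac{1}{n}\log\mathbf{P}\{S_n/n \in (x-\varepsilon,x+\varepsilon)\} \ge \log\phi_\mu(t_x) - t_x(x-\varepsilon)\vee t_x(x+\varepsilon).$$

Since $\varepsilon \in (0,\delta]$ was arbitrary, $(x-\varepsilon,x+\varepsilon) \subseteq (x-\delta,x+\delta)$, and $\log\phi_\mu(t_x) - t_x x = -I(x)$, we obtain

$$\liminf_{n\to\infty} \frac{1}{n}\log\mathbf{P}\{S_n/n \in (x-\delta,x+\delta)\} \ge -I(x).$$

[...]

Example 3 (Normal distributions). Let $(m_n)_{n\in\mathbb{N}} \subset \mathbb{R}$, $(\sigma_n^2)_{n\in\mathbb{N}} \subset [0,\infty)$ be convergent sequences with limits $m$ and $\sigma^2$, respectively. Then the sequence of normal distributions $(N(m_n,\sigma_n^2))_{n\in\mathbb{N}} \subset \mathcal{P}(\mathbb{R})$ converges weakly to $N(m,\sigma^2)$.

Example 4 (Product measures). Suppose $(\mu_n)_{n\in\mathbb{N}} \subset \mathcal{P}(\mathcal{X})$, $(\nu_n)_{n\in\mathbb{N}} \subset \mathcal{P}(\mathcal{Y})$ are weakly convergent sequences with limits $\mu$ and $\nu$, respectively, where $\mathcal{Y}$ is Polish, too. Then the sequence of product measures $(\mu_n \otimes \nu_n)_{n\in\mathbb{N}} \subset \mathcal{P}(\mathcal{X}\times\mathcal{Y})$ converges weakly to $\mu\otimes\nu$.

Example 5 (Marginals vs. joint distribution). Let $X$, $Y$, $Z$ be independent real-valued random variables defined on some probability space $(\Omega,\mathcal{F},\mathbf{P})$ such that $\mathrm{Law}(X) = N(0,1) = \mathrm{Law}(Y)$ (i.e., $X$, $Y$ have standard normal distribution), and $Z$ has Rademacher distribution, that is, $\mathbf{P}(Z = 1) = 1/2 = \mathbf{P}(Z = -1)$. Define a sequence $(\mu_n)_{n\in\mathbb{N}} \subset \mathcal{P}(\mathbb{R}^2)$ by $\mu_{2n} := \mathrm{Law}(X,Y)$, $\mu_{2n-1} := \mathrm{Law}(X,ZX)$, $n \in \mathbb{N}$. Observe that the random variable $ZX$ has standard normal distribution $N(0,1)$. The marginal distributions of $(\mu_n)$ therefore converge weakly (they are constant and equal to $N(0,1)$), while the sequence $(\mu_n)$ itself does not converge (indeed, the joint distribution of $X$ and $ZX$ is not even Gaussian). To prove weak convergence of probability measures on a product space it is therefore not enough to check convergence of the marginal distributions. This suffices only in the case of product measures; cf. Example 4.

The limit of a weakly convergent sequence in $\mathcal{P}(\mathcal{X})$ is unique. Weak convergence induces a topology on $\mathcal{P}(\mathcal{X})$; under this topology, $\mathcal{X}$ being a Polish space, $\mathcal{P}(\mathcal{X})$ is a Polish space, too. Let $d$ be a complete metric compatible with the topology of $\mathcal{X}$; thus $(\mathcal{X},d)$ is a complete and separable metric space. There are different choices for a complete metric on $\mathcal{P}(\mathcal{X})$ that is compatible with the topology of weak convergence. Two common choices are the Prohorov metric and the bounded Lipschitz metric. The Prohorov metric on $\mathcal{P}(\mathcal{X})$ is defined by

$$\rho(\mu,\nu) := \inf\{\varepsilon > 0 : \mu(G) \le \nu(G^\varepsilon) + \varepsilon \text{ for all closed } G \subseteq \mathcal{X}\}, \quad (3.1)$$

where $G^\varepsilon := \{x \in \mathcal{X} : d(x,G) < \varepsilon\}$. Notice that $\rho$ is indeed a metric. The bounded Lipschitz metric on $\mathcal{P}(\mathcal{X})$ is defined by

$$\tilde\rho(\mu,\nu) := \sup\left\{\left|\int f\,d\mu - \int f\,d\nu\right| : f \in C_b(\mathcal{X}) \text{ such that } \|f\|_{bL} \le 1\right\}, \quad (3.2)$$

[...]

If $\mu(\partial B) = 0$, then $\mu(B^\circ) = \mu(\mathrm{cl}(B))$, hence $\lim_{n\to\infty}\mu_n(B) = \mu(B)$.

(vii) $\Rightarrow$ (viii). Let $f \in M_b(\mathcal{X})$ be such that $\mu(U_f) = 0$, and let $A := \{y \in \mathbb{R} : \mu(f^{-1}\{y\}) > 0\}$ be the set of atoms of $\mu\circ f^{-1}$. Since $\mu\circ f^{-1}$ is a finite measure, $A$ is at most countable. Let $\varepsilon > 0$. Then there are $N \in \mathbb{N}$, $y_0,\ldots,y_N \in \mathbb{R}\setminus A$ such that

$$y_0 < -\|f\|_\infty < y_1 < \ldots < y_{N-1} < \|f\|_\infty < y_N, \qquad |y_i - y_{i-1}| \le \varepsilon.$$

For $i \in \{1,\ldots,N\}$ set $B_i := f^{-1}\{[y_{i-1},y_i)\}$. Then

$$\mu(\partial B_i) \le \mu(f^{-1}\{y_{i-1}\}) + \mu(f^{-1}\{y_i\}) + \mu(U_f) = 0.$$

Using (vii) we obtain

$$\limsup_{n\to\infty}\int f\,d\mu_n \le \limsup_{n\to\infty}\sum_{i=1}^N \mu_n(B_i)\,y_i = \sum_{i=1}^N \mu(B_i)\,y_i \le \varepsilon + \int f\,d\mu.$$

Since $\varepsilon$ was arbitrary, it follows that $\limsup_{n\to\infty}\int f\,d\mu_n \le \int f\,d\mu$. The same argument applied to $-f$ yields the inequality $\liminf_{n\to\infty}\int f\,d\mu_n \ge \int f\,d\mu$.

The implication (viii) $\Rightarrow$ (i) is immediate.

From Definition 3.1 it is clear that weak convergence is preserved under continuous mappings. The mapping theorem for weak convergence requires continuity only with probability one with respect to the limit measure; this should be compared to characterization (vii) in Theorem 3.1.

Theorem 3.2 (Mapping theorem). Let $(\mu_n)_{n\in\mathbb{N}} \subset \mathcal{P}(\mathcal{X})$, $\mu \in \mathcal{P}(\mathcal{X})$. Let $\mathcal{Y}$ be a second Polish space, and let $\psi: \mathcal{X} \to \mathcal{Y}$ be a measurable mapping. If $\mu_n \xrightarrow{w} \mu$ and $\mu\{x \in \mathcal{X} : \psi \text{ discontinuous at } x\} = 0$, then $\mu_n\circ\psi^{-1} \xrightarrow{w} \mu\circ\psi^{-1}$.

Proof. By part (v) of Theorem 3.1, it is enough to show that for every $O \subseteq \mathcal{Y}$ open,

$$\liminf_{n\to\infty} \mu_n\left(\psi^{-1}(O)\right) \ge \mu\left(\psi^{-1}(O)\right).$$

Let $O \subseteq \mathcal{Y}$ be open. Set $C := \{x \in \mathcal{X} : \psi \text{ continuous at } x\}$. Then $\psi^{-1}(O) \cap C$ is contained in $(\psi^{-1}(O))^\circ$, the interior of $\psi^{-1}(O)$. Since $(\psi^{-1}(O))^\circ$ is open, $\mu(C) = 1$ and $\mu_n \xrightarrow{w} \mu$ by hypothesis, it follows from part (v) of Theorem 3.1 that

$$\mu\left(\psi^{-1}(O)\right) = \mu\left((\psi^{-1}(O))^\circ\right) \le \liminf_{n\to\infty}\mu_n\left((\psi^{-1}(O))^\circ\right) \le \liminf_{n\to\infty}\mu_n\left(\psi^{-1}(O)\right).$$

[...]

By hypothesis, $(X_n)_{n\in\mathbb{N}}$ is uniformly integrable, that is,

$$\lim_{M\to\infty}\sup_{n\in\mathbb{N}} \mathbf{E}_n\left[|X_n|\,\mathbf{1}_{\{|X_n| \ge M\}}\right] = 0.$$

This implies, in particular, that $\sup_{n\in\mathbb{N}}\mathbf{E}_n[|X_n|] < \infty$. By (3.3a), for $M \in \mathbb{N}$ we can choose $n_M \in \mathbb{N}$ such that $\left|\mathbf{E}_{n_M}[|X_{n_M}| \wedge M] - \mathbf{E}[|X| \wedge M]\right| \le 1$. It follows that

$$\sup_{M\in\mathbb{N}}\mathbf{E}[|X|\wedge M] \le \sup_{M\in\mathbb{N}}\left|\mathbf{E}_{n_M}[|X_{n_M}|\wedge M] - \mathbf{E}[|X|\wedge M]\right| + \sup_{M\in\mathbb{N}}\sup_{n\in\mathbb{N}}\mathbf{E}_n[|X_n|\wedge M] \le 1 + \sup_{n\in\mathbb{N}}\mathbf{E}_n[|X_n|] < \infty.$$

This shows $\mathbf{E}[|X|] < \infty$, that is, $X$ is integrable, since $\mathbf{E}[|X|\wedge M] \nearrow \mathbf{E}[|X|]$ as $M \to \infty$ by monotone convergence. Now for any $n \in \mathbb{N}$, any $M \in \mathbb{N}$,

$$|\mathbf{E}[X] - \mathbf{E}_n[X_n]| \le \left|\mathbf{E}[-M\vee(X\wedge M)] - \mathbf{E}_n[-M\vee(X_n\wedge M)]\right| + \mathbf{E}\left[|X|\,\mathbf{1}_{\{|X|\ge M\}}\right] + \mathbf{E}_n\left[|X_n|\,\mathbf{1}_{\{|X_n|\ge M\}}\right].$$

Let $\varepsilon > 0$. By the integrability of $X$ and the uniform integrability of $(X_n)$ one finds $M_\varepsilon \in \mathbb{N}$ such that

$$\mathbf{E}\left[|X|\,\mathbf{1}_{\{|X|\ge M_\varepsilon\}}\right] + \sup_{k\in\mathbb{N}}\mathbf{E}_k\left[|X_k|\,\mathbf{1}_{\{|X_k|\ge M_\varepsilon\}}\right] \le \frac{\varepsilon}{2}.$$

By (3.3a), we can choose $n_\varepsilon = n(M_\varepsilon)$ such that

$$\left|\mathbf{E}[-M_\varepsilon\vee(X\wedge M_\varepsilon)] - \mathbf{E}_{n_\varepsilon}[-M_\varepsilon\vee(X_{n_\varepsilon}\wedge M_\varepsilon)]\right| \le \frac{\varepsilon}{2}.$$

It follows that $|\mathbf{E}[X] - \mathbf{E}_{n_\varepsilon}[X_{n_\varepsilon}]| \le \varepsilon$, which establishes the desired convergence since $\varepsilon$ was arbitrary.

Suppose we have a sequence $(X_n)_{n\in\mathbb{N}}$ of random variables that converges in distribution to some random variable $X$; thus $\mathrm{Law}(X_n) \xrightarrow{w} \mathrm{Law}(X)$. If the relation (in particular, the joint distribution) between $X_1, X_2, \ldots$ is irrelevant, one may work with random variables that converge almost surely.

Theorem 3.5 (Skorohod representation). Let $(\mu_n)_{n\in\mathbb{N}} \subset \mathcal{P}(\mathcal{X})$. If $\mu_n \xrightarrow{w} \mu$ for some $\mu \in \mathcal{P}(\mathcal{X})$, then there exists a probability space $(\Omega,\mathcal{F},\mathbf{P})$ carrying $\mathcal{X}$-valued random variables $X_n$, $n\in\mathbb{N}$, and $X$ such that $\mathbf{P}\circ X_n^{-1} = \mu_n$ for every $n\in\mathbb{N}$, $\mathbf{P}\circ X^{-1} = \mu$, and $X_n \to X$ as $n\to\infty$ $\mathbf{P}$-almost surely.

[...]

Theorem 3.7. Let $I$ be a non-empty set, and let $(\mu_i)_{i\in I} \subset \mathcal{P}(C([0,\infty),\mathbb{R}^d))$. Then $(\mu_i)_{i\in I}$ is relatively compact if and only if the following two conditions hold:

(i) $(\mu_i\circ(X(0))^{-1})_{i\in I}$ is tight in $\mathcal{P}(\mathbb{R}^d)$, and

(ii) for every $\varepsilon > 0$, every $T \in \mathbb{N}$ there is $\delta > 0$ such that

$$\sup_{i\in I}\mu_i\left(\left\{\omega \in C([0,\infty),\mathbb{R}^d) : w_T(\omega,\delta) > \varepsilon\right\}\right) \le \varepsilon,$$

where $w_T(\omega,\delta) := \sup_{s,t\in[0,T]:\,|t-s|\le\delta}|\omega(t)-\omega(s)|$ is the modulus of continuity of $\omega$ with size $\delta$ over the time interval $[0,T]$.

Proof. See, for instance, Theorem 2.7.3 in Billingsley [1999, pp. 82-83]; the extension from a compact time interval to $[0,\infty)$ is straightforward (footnote 1).

Theorem 3.7 should be compared to the Arzelà-Ascoli criterion for relative compactness in $C([0,\infty),\mathbb{R}^d)$. The next theorem gives a sufficient condition for relative compactness (or tightness) in $\mathcal{P}(C([0,\infty),\mathbb{R}^d))$; the result should be compared to the Kolmogorov-Chentsov continuity theorem.

Theorem 3.8 (Kolmogorov's sufficient condition). Let $I$ be a non-empty set, and let $(\mu_i)_{i\in I} \subset \mathcal{P}(C([0,\infty),\mathbb{R}^d))$. Suppose that

(i) $(\mu_i\circ(X(0))^{-1})_{i\in I}$ is tight in $\mathcal{P}(\mathbb{R}^d)$, and

(ii) there are strictly positive numbers $C$, $\alpha$, $\beta$ such that for all $t,s\in[0,\infty)$, all $i\in I$,

$$\mathbf{E}_{\mu_i}\left[|X(s)-X(t)|^\alpha\right] \le C\,|t-s|^{1+\beta}.$$

Then $(\mu_i)_{i\in I}$ is relatively compact in $\mathcal{P}(C([0,\infty),\mathbb{R}^d))$.

Proof. See, for instance, Corollary 16.5 in Kallenberg [2001, p. 313].

Tightness of a family $(\mu_i)_{i\in I} \subset \mathcal{P}(S)$ can often be established by showing that $\sup_{i\in I} G(\mu_i) < \infty$ for an appropriate function $G$. This works if $G$ is a tightness function in the sense of the definition below.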
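Footnote 1: The issue is more delicate when passing from the Skorohod space $D([0,T],\mathcal{X})$ to the Skorohod space $D([0,\infty),\mathcal{X})$; cf. Chapter 3 in Billingsley [1999].

[...]

Clearly, $\lim_{x\to 0+} x\log(x) = 0$. Since $\int f\,d\nu = 1$ and $x\log(x) \ge x - 1$ for all $x \ge 0$ with equality if and only if $x = 1$, it follows that $R(\mu\|\nu) \ge 0$ with $R(\mu\|\nu) = 0$ if and only if $\mu = \nu$. Relative entropy can actually be defined for $\sigma$-finite measures on an arbitrary measurable space.

Lemma 4.1 (Basic properties). Properties of relative entropy $R(\cdot\|\cdot)$ for probability measures on a Polish space $S$.

(a) Relative entropy is a non-negative, convex, lower semicontinuous function $\mathcal{P}(S)\times\mathcal{P}(S) \to [0,\infty]$.

(b) For $\nu\in\mathcal{P}(S)$, $R(\cdot\|\nu)$ is strictly convex on $\{\mu\in\mathcal{P}(S) : R(\mu\|\nu) < \infty\}$.

(c) For $\nu\in\mathcal{P}(S)$, $R(\cdot\|\nu)$ has compact sublevel sets.

(d) Let $\Pi_S$ denote the set of finite measurable partitions of $S$. Then for all $\mu,\nu\in\mathcal{P}(S)$,

$$R(\mu\|\nu) = \sup_{\pi\in\Pi_S}\sum_{A\in\pi}\mu(A)\log\frac{\mu(A)}{\nu(A)},$$

where $x\log(x/y) = 0$ if $x = 0$, $x\log(x/y) = \infty$ if $x > 0$ and $y = 0$.

(e) For every $A\in\mathcal{B}(S)$, any $\mu,\nu\in\mathcal{P}(S)$,

$$R(\mu\|\nu) \ge \mu(A)\log\frac{\mu(A)}{\nu(A)} - 1.$$

Proof. See Lemma 1.4.3, parts (b), (c), (g), in Dupuis and Ellis [1997, pp. 29-30].

Lemma 4.2 (Contraction property). Let $\psi: \mathcal{Y}\to\mathcal{X}$ be a Borel measurable mapping. Let $\eta\in\mathcal{P}(\mathcal{X})$, $\gamma_0\in\mathcal{P}(\mathcal{Y})$. Then

$$R\left(\eta\,\|\,\gamma_0\circ\psi^{-1}\right) = \inf_{\gamma\in\mathcal{P}(\mathcal{Y}):\,\gamma\circ\psi^{-1}=\eta} R\left(\gamma\,\|\,\gamma_0\right), \quad (4.1)$$

where $\inf\emptyset = \infty$ by convention.

Proof (sketch). Inequality $\le$: analogous to the proof of Lemma E.2.1 in Dupuis and Ellis [1997, p. 366]. For the opposite inequality, check that the probability measure $\gamma$ defined by

$$\gamma(dy) := \frac{d\eta}{d(\gamma_0\circ\psi^{-1})}(\psi(y))\,\gamma_0(dy)$$

attains the infimum whenever that infimum is finite (footnote 2).

Footnote 2: For more details and an application see, for instance, M. Fischer, On the form of the large deviation rate function for the empirical measures of weakly interacting systems, arXiv:1208.0472 [math.PR].

[...]

with density $\frac{d\mu}{d\gamma}$, but also absolutely continuous with respect to $\nu$ with density $\frac{d\mu}{d\nu} = \frac{d\mu}{d\gamma}\frac{d\gamma}{d\nu}$, where $\frac{d\gamma}{d\nu} = \frac{e^{-g}}{\int e^{-g}\,d\nu}$. It follows that

$$R(\mu\|\nu) + \int_S g\,d\mu = \int_S \log\frac{d\mu}{d\nu}\,d\mu + \int_S g\,d\mu = \int_S \log\frac{d\mu}{d\gamma}\,d\mu + \int_S \log\frac{d\gamma}{d\nu}\,d\mu + \int_S g\,d\mu = R(\mu\|\gamma) - \log\int_S e^{-g}\,d\nu.$$

This yields the assertion since $R(\mu\|\gamma) \ge 0$ with $R(\mu\|\gamma) = 0$ if and only if $\mu = \gamma$.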
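On a finite state space the quantities above reduce to finite sums, so the identity $R(\mu\|\nu) + \int g\,d\mu = R(\mu\|\gamma) - \log\int e^{-g}\,d\nu$ and the partition bound implied by Lemma 4.1 (d) can be checked numerically. The following Python sketch does this for probability vectors; it assumes $S = \{0,\ldots,k-1\}$, and the function names (`relative_entropy`, `tilted`, `laplace_identity_gap`, `partition_sum`) are hypothetical helpers introduced only for illustration.

```python
import math


def relative_entropy(mu, nu):
    """R(mu || nu) = sum_i mu_i log(mu_i / nu_i) on a finite set.

    Uses the conventions 0 log 0 = 0 and R = +inf if mu is not
    absolutely continuous with respect to nu.
    """
    total = 0.0
    for m, v in zip(mu, nu):
        if m == 0.0:
            continue
        if v == 0.0:
            return math.inf
        total += m * math.log(m / v)
    return total


def tilted(nu, g):
    """Return (gamma, z) with d(gamma)/d(nu) = exp(-g) / z, z = sum_i exp(-g_i) nu_i."""
    weights = [v * math.exp(-gi) for v, gi in zip(nu, g)]
    z = sum(weights)
    return [w / z for w in weights], z


def laplace_identity_gap(mu, nu, g):
    """Difference between the two sides of
    R(mu||nu) + int g dmu = R(mu||gamma) - log int exp(-g) dnu."""
    gamma, z = tilted(nu, g)
    lhs = relative_entropy(mu, nu) + sum(m * gi for m, gi in zip(mu, g))
    rhs = relative_entropy(mu, gamma) - math.log(z)
    return lhs - rhs


def partition_sum(mu, nu, blocks):
    """Sum over a partition (list of index blocks) of mu(A) log(mu(A) / nu(A))."""
    coarse_mu = [sum(mu[i] for i in block) for block in blocks]
    coarse_nu = [sum(nu[i] for i in block) for block in blocks]
    return relative_entropy(coarse_mu, coarse_nu)


if __name__ == "__main__":
    mu, nu, g = [0.2, 0.5, 0.3], [0.4, 0.4, 0.2], [1.0, -0.5, 2.0]
    print("identity gap:", laplace_identity_gap(mu, nu, g))
    print("R(mu||nu):", relative_entropy(mu, nu))
    print("partition {0,1},{2}:", partition_sum(mu, nu, [[0, 1], [2]]))
```

Choosing $\mu = \gamma$ makes $R(\mu\|\gamma)$ vanish, so the left-hand side then equals $-\log\int e^{-g}\,d\nu$; this is the minimizing choice singled out in the last sentence of the proof above.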
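Lemma 4.4 also allows one to derive the Donsker-Varadhan variational formula for relative entropy itself.

Lemma 4.5 (Donsker-Varadhan). Let $\mu,\nu\in\mathcal{P}(S)$. Then

$$R(\mu\|\nu) = \sup_{g\in M_b(S)}\left\{\int_S g(x)\,\mu(dx) - \log\int_S \exp(g(x))\,\nu(dx)\right\}.$$

Proof. Let $\mu,\nu\in\mathcal{P}(S)$. By Lemma 4.4, for every $g\in M_b(S)$,

$$R(\mu\|\nu) \ge -\int_S g\,d\mu - \log\int_S e^{-g}\,d\nu,$$

hence

$$R(\mu\|\nu) \ge \sup_{g\in M_b(S)}\left\{-\int_S g\,d\mu - \log\int_S e^{-g}\,d\nu\right\} = \sup_{g\in M_b(S)}\left\{\int_S g\,d\mu - \log\int_S e^{g}\,d\nu\right\}.$$

For $g\in M_b(S)$ set $J(g) := \int_S g\,d\mu - \log\int_S e^{g}\,d\nu$. Thus $R(\mu\|\nu) \ge \sup_{g\in M_b(S)} J(g)$. To obtain equality, it is enough to find a sequence $(g_M)_{M\in\mathbb{N}} \subset M_b(S)$ such that $\limsup_{M\to\infty} J(g_M) = R(\mu\|\nu)$. We distinguish two cases.

First case: $\mu$ is not absolutely continuous with respect to $\nu$. Then $R(\mu\|\nu) = \infty$ and there exists $A\in\mathcal{B}(S)$ such that $\mu(A) > 0$ while $\nu(A) = 0$. Choose such a set $A$ and set $g_M := M\cdot\mathbf{1}_A$. Then, for every $M\in\mathbb{N}$, $g_M = 0$ $\nu$-almost surely, thus $\int e^{g_M}\,d\nu = \int e^{0}\,d\nu = 1$, hence $\log\int e^{g_M}\,d\nu = 0$. It follows that

$$\limsup_{M\to\infty} J(g_M) = \limsup_{M\to\infty}\int_S g_M\,d\mu = \limsup_{M\to\infty} M\cdot\mu(A) = \infty.$$

[...]

Remark 4.2. Lemmata 4.4 and 4.5 imply a relationship of convex duality between Laplace functionals and relative entropy. Let $\nu\in\mathcal{P}(S)$. Then

$$R(\mu\|\nu) = \sup_{g\in M_b(S)}\left\{\int_S g\,d\mu - \log\int_S e^{g}\,d\nu\right\}, \quad \mu\in\mathcal{P}(S),$$

$$\log\int_S e^{g}\,d\nu = \sup_{\mu\in\mathcal{P}(S)}\left\{\int_S g\,d\mu - R(\mu\|\nu)\right\}, \quad g\in M_b(S),$$

that is, the functions $\mu\mapsto R(\mu\|\nu)$ and $g\mapsto\log\int e^{g}\,d\nu$ are convex conjugates.

References

P. Billingsley. Convergence of Probability Measures. Wiley Series in Probability and Statistics. John Wiley & Sons, New York, 1968.

P. Billingsley. Convergence of Probability Measures. Wiley Series in Probability and Statistics. John Wiley & Sons, New York, 2nd edition, 1999.

A. Dembo and O. Zeitouni. Large Deviations Techniques and Applications, volume 38 of Applications of Mathematics. Springer, New York, 2nd edition, 1998.

F. den Hollander. Large Deviations, volume 14 of Fields Institute Monographs. American Mathematical Society, Providence, RI, 2000.

J.-D. Deuschel and D. W. Stroock. Large Deviations. Academic Press, Boston, 1989.

R. Dudley. Real Analysis and Probability. Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2002.

P. Dupuis and R. S. Ellis. A Weak Convergence Approach to the Theory of Large Deviations. Wiley Series in Probability and Statistics. John Wiley & Sons, New York, 1997.

R. S. Ellis. Entropy, Large Deviations and Statistical Mechanics, volume 271 of Grundlehren der mathematischen Wissenschaften. Springer, New York, 1985.

S. N. Ethier and T. G. Kurtz. Markov Processes: Characterization and Convergence. Wiley Series in Probability and Statistics. John Wiley & Sons, New York, 1986.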