Statistica Sinica 17 (2007), 1617-1642

ASYMPTOTICS OF SAMPLE EIGENSTRUCTURE FOR A LARGE DIMENSIONAL SPIKED COVARIANCE MODEL

Debashis Paul

University of California, Davis

Abstract: This paper deals with a multivariate Gaussian observation model where the eigenvalues of the covariance matrix are all one, except for a finite number which are larger. Of interest is the asymptotic behavior of the eigenvalues of the sample covariance matrix when the sample size and the dimension of the observations both grow to infinity so that their ratio converges to a positive constant. When a population eigenvalue is above a certain threshold and of multiplicity one, the corresponding sample eigenvalue has a Gaussian limiting distribution. There is a "phase transition" of the sample eigenvectors in the same setting. Another contribution here is a study of the second order asymptotics of sample eigenvectors when corresponding eigenvalues are simple and sufficiently large.

Key words and phrases: Eigenvalue distribution, principal component analysis, random matrix theory.

1. Introduction

The study of eigenvalues and eigenvectors of sample covariance matrices has a long history. When the dimension $N$ is fixed, distributional aspects for both Gaussian and non-Gaussian observations have been dealt with at length by various authors. Anderson (1963), Muirhead (1982) and Tyler (1983) are among standard references. With dimension fixed, much of the study of the eigenstructure of sample covariance matrix is based on the fact that sample covariance approximates population covariance matrix well when sample size is large. However this is no longer the case when $N/n \to \gamma \in (0,\infty)$ as $n \to \infty$, where $n$ is the sample size. Under these circumstances it is known (see Bai (1999) for a review) that, if the true covariance is the identity matrix, then the Empirical Spectral Distribution (ESD) converges almost surely to the Marčenko-Pastur distribution, henceforth denoted by $F_\gamma$. When $\gamma \le 1$, the support of $F_\gamma$ is the set $[(1-\sqrt{\gamma})^2, (1+\sqrt{\gamma})^2]$, and when $\gamma > 1$ an isolated point zero is added to the support. It is known (Bai and Yin (1993)) that when the population covariance is the identity, the largest and the smallest eigenvalues, when $\gamma \le 1$, converge almost surely to the respective boundaries of the support of $F_\gamma$. Johnstone (2001)
derived the asymptotic distribution for the largest sample eigenvalue under the setting of an identity covariance under Gaussianity. Soshnikov (2002) proved the distributional limits under weaker assumptions, in addition to deriving distributional limits of the $k$th largest eigenvalue, for fixed but arbitrary $k$.

However, in recent years researchers in various fields have been using different versions of non-identity covariance matrices of growing dimension. Among these, a particularly interesting model has most of the eigenvalues one, and the few that are not are well-separated from the rest. This has been deemed the "spiked population model" by Johnstone (2001). It has also been observed that for certain types of data, e.g., in speech recognition (Buja, Hastie and Tibshirani (1995)), wireless communication (Telatar (1999)), statistical learning (Hoyle and Rattray (2003, 2004)), a few of the sample eigenvalues have limiting behavior that is different from the behavior when the covariance is the identity. The results of this paper lend understanding to these phenomena.

The literature on the asymptotics of sample eigenvalues when the covariance is not the identity is relatively recent. Silverstein and Choi (1995) derived the almost sure limit of the ESD under fairly general conditions. Bai and Silverstein (2004) derived the asymptotic distribution of certain linear spectral statistics. However, a systematic study of the individual eigenvalues has been conducted only recently by Péché (2003) and Baik, Ben Arous and Péché (2005). These authors deal with the situation where the observations are complex Gaussian and the covariance matrix is a finite rank perturbation of identity. Baik and Silverstein (2006) study the almost sure limits of sample eigenvalues when the observations are either real or complex, and under fairly weak distributional assumptions. They give almost sure limits of the $M$ largest and $M$ smallest (non-zero) sample eigenvalues, where $M$ is the number of non-unit population eigenvalues.

A crucial aspect of the work of the last three sets of authors is the discovery of a phase transition phenomenon. Simply put, if the non-unit eigenvalues are close to one, then their sample versions will behave in roughly the same way as if the true covariance were the identity. However, when the true eigenvalues are larger than $1+\sqrt{\gamma}$, the sample eigenvalues have a different asymptotic property. The results of Baik et al. (2005) show an $n^{2/3}$ scaling for the asymptotic distribution when a non-unit population eigenvalue lies below the threshold $1+\sqrt{\gamma}$, and an $n^{1/2}$ scaling for those above that threshold.

This paper is about the case of independently and identically distributed observations $X_1, \ldots, X_n$ from an $N$-variate real Gaussian distribution with mean zero and covariance $\Sigma = \mathrm{diag}(\ell_1, \ell_2, \ldots, \ell_M, 1, \ldots, 1)$, where $\ell_1 \ge \ell_2 \ge \cdots \ge \ell_M > 1$. Notice that, since observations are Gaussian, there is no loss of generality in assuming the covariance matrix to be diagonal: under an orthogonal transformation of the data, the sample eigenvalues are invariant and the sample eigenvectors are equivariant. The $N \times n$ matrix $X = (X_1 : \cdots : X_n)$ is a double array, indexed by both $n$ and $N = N(n)$ on the same probability space, and with $N/n \to \gamma$, where $\gamma$ is a positive constant. Throughout it is assumed that $0 < \gamma < 1$, although much of the analysis can be extended to the case $\gamma \ge 1$ with a little extra work. The aim is to study the asymptotic behavior of the large eigenvalues of the sample covariance matrix $S = (1/n) X X^T$ as $n \to \infty$. In this context, the primary focus of study is the second order behavior of the $M$ largest eigenvalues of the sample covariance matrix. Distributional limits of the sample eigenvalues $\hat\ell_\nu$ are derived when $\ell_\nu > 1+\sqrt{\gamma}$, for the case when $\ell_\nu$ has multiplicity one. A comprehensive study of all possible scenarios is beyond the scope of this paper. The almost sure limits (see Theorem 1 and Theorem 2) of the sample eigenvalues, obtained by Baik and Silverstein (2006), are used in the proofs of some of the results. However, in the Gaussian case the same limits can be derived through the approach taken here in deriving the distributional limits of the eigenvalues and eigenvectors. For details refer to Paul (2004). This alternative approach gives a different perspective to the limits, in particular to their identification as certain linear functionals of the limiting Marčenko-Pastur law when the true eigenvalue is above $1+\sqrt{\gamma}$. Another aspect of the current approach is that it throws light on the behavior of the eigenvectors associated with the $M$ largest eigenvalues. The sample eigenvectors also undergo a phase transition. By performing a natural decomposition of the sample eigenvectors into "signal" and "noise" parts, it is shown that when $\ell_\nu > 1+\sqrt{\gamma}$, the "signal" part of the eigenvectors is asymptotically normal. This paper also gives a reasonably thorough description of the "noise" part of the eigenvectors.

The results derived in this paper contain some important messages for inference on multivariate data. First, the phase transition phenomena described in this paper mean that some commonly used tests for the hypothesis $\Sigma = I$, like the largest root test (Roy (1953)), may not reliably detect small departures from an identity covariance when the ratio $N/n$ is significantly larger than zero. At the same time, Theorem 3 can be used for making inference on the larger population eigenvalues. This is discussed further in Section 2.2. Second, an important consequence of the results here is insight as to why it might not be such a good idea to use Principal Component Analysis (PCA) for dimension reduction in a high-dimensional setting, at least not in its standard form. This has been observed by Johnstone and Lu (2004), who show that when $N/n \to \gamma \in (0,1)$, the sample principal components are inconsistent estimates of the population principal components. Theorem 4 says exactly how bad this inconsistency is
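and its proof demonstrates clearly how this inconsistency originates. Theorem 5 and Theorem 6 are important to understanding the second order behavior of the sample eigenvectors, and have consequences for analyses of functional data. This is elaborated on in Section 2.4.

The rest of the paper is organized as follows. Section 2 has the main results. Section 3 has key quantities and expressions that are required to derive the results. Section 4 is devoted to deriving the asymptotic distribution of eigenvalues (Theorem 3). Section 5 concerns matrix perturbation analysis, which is a key ingredient in the proofs of Theorem 4-6. Some of the proofs are given in Appendix A and Appendix B.

2. Discussion of the Results

Throughout $\hat\ell_\nu$ is used to denote the $\nu$th largest eigenvalue of $S$, and $\Longrightarrow$ is used to denote convergence in distribution.

2.1. Almost sure limit of M largest eigenvalues

The following results are due to Baik and Silverstein (2006), and are proved under finite fourth moment assumptions on the distribution of the random variables.

Theorem 1. Suppose that $\ell_\nu \le 1+\sqrt{\gamma}$ and that $N/n \to \gamma \in (0,1)$, as $n \to \infty$. Then
\[ \hat\ell_\nu \to (1+\sqrt{\gamma})^2, \quad \text{almost surely as } n \to \infty. \tag{1} \]

Theorem 2. Suppose that $\ell_\nu > 1+\sqrt{\gamma}$ and that $N/n \to \gamma \in (0,1)$ as $n \to \infty$. Then
\[ \hat\ell_\nu \to \ell_\nu\left(1 + \frac{\gamma}{\ell_\nu - 1}\right), \quad \text{almost surely as } n \to \infty. \tag{2} \]

Denote the limit in (2) by $\rho_\nu := \ell_\nu\left(1 + \frac{\gamma}{\ell_\nu - 1}\right)$; $\rho_\nu$ appears (Lemma B.1) as a solution to the equation
\[ \rho = \ell\left(1 + \gamma \int \frac{x}{\rho - x}\, dF_\gamma(x)\right) \tag{3} \]
with $\ell = \ell_\nu$. Since $F_\gamma$ is supported on $[(1-\sqrt{\gamma})^2, (1+\sqrt{\gamma})^2]$ for $\gamma \le 1$ (with a single isolated point added to the support for $\gamma > 1$), the function on the RHS is monotonically decreasing in $\rho \in ((1+\sqrt{\gamma})^2, \infty)$, while the LHS is obviously increasing in $\rho$. So a solution to (3) exists only if $\ell_\nu \ge 1 + c_\gamma$, for some $c_\gamma > 0$. That $c_\gamma = \sqrt{\gamma}$ is a part of Lemma B.1. Note that when $\ell_\nu = 1+\sqrt{\gamma}$, $\rho_\nu = (1+\sqrt{\gamma})^2$ is the almost sure limit of the $j$th largest eigenvalue (for $j$ fixed) in the identity covariance case.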
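The two almost sure limits in Theorem 1 and Theorem 2 can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the helper names `spiked_limit` and `top_sample_eigenvalues`, the dimensions, the seed and the two spike values are illustrative choices that do not come from the paper, and a single finite-sample draw will only be close to the limits, not equal to them.

```python
import numpy as np

def spiked_limit(ell, gamma):
    """Almost sure limit of the sample eigenvalue paired with population eigenvalue ell."""
    threshold = 1.0 + np.sqrt(gamma)
    if ell > threshold:
        # Theorem 2: limit rho = ell * (1 + gamma / (ell - 1))
        return ell * (1.0 + gamma / (ell - 1.0))
    # Theorem 1: the eigenvalue sticks to the edge (1 + sqrt(gamma))^2
    return threshold ** 2

def top_sample_eigenvalues(spikes, N, n, rng):
    """Largest len(spikes) eigenvalues of S = X X^T / n with Sigma = diag(spikes, 1, ..., 1)."""
    spikes = np.asarray(spikes, dtype=float)
    scale = np.ones(N)
    scale[:spikes.size] = np.sqrt(spikes)
    X = scale[:, None] * rng.standard_normal((N, n))   # N x n Gaussian data matrix
    S = X @ X.T / n
    eigs = np.linalg.eigvalsh(S)[::-1]                 # decreasing order
    return eigs[:spikes.size]

# Illustrative (hypothetical) setting: gamma = N/n = 0.25, so the threshold is 1.5.
rng = np.random.default_rng(0)
N, n = 200, 800
spikes = [4.0, 1.2]          # one spike above the threshold, one below
observed = top_sample_eigenvalues(spikes, N, n, rng)
predicted = [spiked_limit(ell, N / n) for ell in spikes]
for ell, obs, pred in zip(spikes, observed, predicted):
    print(f"ell = {ell:.2f}: sample eigenvalue {obs:.3f}, limit {pred:.3f}")
```

The spike above the threshold should land near $\rho_\nu$, while the spike below it should be indistinguishable from the bulk edge $(1+\sqrt{\gamma})^2$.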
2.2. Asymptotic normality of sample eigenvalues

When a non-unit eigenvalue of $\Sigma$ is of multiplicity one, and above the critical value $1+\sqrt{\gamma}$, it is shown that the corresponding sample eigenvalue is asymptotically normally distributed. Note that for the complex Gaussian case, a result in the analogous situation has been derived by Baik et al. (2005, Thm 1.1(b)). They showed that when the largest eigenvalue is greater than $1+\sqrt{\gamma}$ and of multiplicity $k$, the largest sample eigenvalue, after similar centering and scaling, converges in distribution to the distribution of the largest eigenvalue of a $k \times k$ GUE (Gaussian Unitary Ensemble). They also derived the limiting distributions for the case when a (non-unit) population eigenvalue is smaller than $1+\sqrt{\gamma}$. Distributional aspect of a sample eigenvalue for the real case in the latter situation is beyond the scope of this paper. The method used in this paper differs substantially from the approach taken by Baik et al. (2005). They used the joint density of the eigenvalues of $S$ to derive a determinantal representation of the distribution of the largest eigenvalue and then adopted the steepest descent method for the asymptotic analysis. However, this paper relies on a matrix analysis approach.

Theorem 3. Suppose that $\ell_\nu > 1+\sqrt{\gamma}$ and that $\ell_\nu$ has multiplicity 1. Then as $n, N \to \infty$ so that $N/n - \gamma = o(n^{-1/2})$,
\[ \sqrt{n}\,(\hat\ell_\nu - \rho_\nu) \Longrightarrow N(0, \sigma^2(\ell_\nu)) \tag{4} \]
where, for $\ell > 1+\sqrt{\gamma}$ and $\rho(\ell) = \ell(1 + \gamma/(\ell-1))$,
\[ \sigma^2(\ell) = \frac{2\ell\rho(\ell)}{1 + \ell\gamma \int \frac{x}{(\rho(\ell)-x)^2}\, dF_\gamma(x)} = \frac{2\ell\rho(\ell)}{1 + \frac{\ell\gamma}{(\ell-1)^2 - \gamma}} = 2\ell^2\left(1 - \frac{\gamma}{(\ell-1)^2}\right). \tag{5} \]

In the fixed $N$ case, when the $\nu$th eigenvalue has multiplicity 1, the $\nu$th sample eigenvalue is asymptotically $N(\ell_\nu, (1/n)\,2\ell_\nu^2)$ (Anderson (1963)). Thus the positivity of the dimension to sample size ratio creates a bias and reduces the variance. However, if $\gamma$ is much smaller compared to $\ell_\nu$, the variance $\sigma^2(\ell_\nu)$ is approximately $2\ell_\nu^2$ which is the asymptotic variance in the fixed $N$ case. This is what we expect intuitively, since the eigenvector associated with this sample eigenvalue, looking to maximize the quadratic form involving $S$ (under orthogonality restrictions), will tend to put more mass on the $\nu$th coordinate. This is demonstrated even more clearly by Theorem 4.

Suppose that we test the hypothesis $\Sigma = I$ versus the alternative that $\Sigma = \mathrm{diag}(\ell_1, \ldots, \ell_M, 1, \ldots, 1)$ with $\ell_1 \ge \cdots \ge \ell_M > 1$, based on i.i.d. observations from $N(0, \Sigma)$. If $\ell_1 > 1+\sqrt{\gamma}$, it follows from Theorem 2 that the largest root test is asymptotically consistent. For the special case when $\ell_1$ is of multiplicity one, Theorem 3 gives an expression for the asymptotic power function, assuming that $N/n$ converges to $\gamma$ fast enough, as $n \to \infty$. One has to view this in context, since the result is derived under the assumption that $\ell_1, \ldots, \ell_M$ are all fixed, and we do not have a rate of convergence for the distribution of $\hat\ell_1$ toward normality. However, Theorem 3 can be used to find confidence intervals for the larger eigenvalues under the non-null model.

2.3. Angle between true and estimated eigenvectors

It is well-known (see, for example, Muirhead (1982), or Anderson (1963)) that, when $\Sigma = I$ and the observations are Gaussian, the matrix of sample eigenvectors of $S$ is Haar distributed. In the non-Gaussian situation, Silverstein (1990) showed weak convergence of random functions of this matrix. In the context of non-identity covariance, Hoyle and Rattray (2004) described a phase transition phenomenon in the asymptotic behavior of the angle between the true and estimated eigenvector associated with a non-unit eigenvalue $\ell_\nu$. They term this "the phenomenon of retarded learning". They derived this result at a level of rigor consistent with that in the physics literature. Their result can be rephrased in our context to mean that if $1 < \ell_\nu \le 1+\sqrt{\gamma}$ is a simple eigenvalue, then the cosine of the angle between the corresponding true and estimated eigenvectors converges almost surely to zero; yet, there is a strictly positive limit if $\ell_\nu > 1+\sqrt{\gamma}$. Part (a) of Theorem 4, stated below and proved in Section 5, is a precise statement of the latter part of their result. This also readily proves a stronger version of the result regarding inconsistency of sample eigenvectors as is stated in Johnstone and Lu (2004).

Theorem 4. Suppose that $N/n \to \gamma \in (0,1)$ as $n, N \to \infty$. Let $\tilde e_\nu$ denote the $N \times 1$ vector with 1 in the $\nu$th coordinate and zeros elsewhere, and let $p_\nu$ denote the eigenvector of $S$ associated with the eigenvalue $\hat\ell_\nu$.

(a) If $\ell_\nu > 1+\sqrt{\gamma}$ and of multiplicity one,
\[ |\langle p_\nu, \tilde e_\nu \rangle| \xrightarrow{a.s.} \sqrt{\left(1 - \frac{\gamma}{(\ell_\nu - 1)^2}\right) \Big/ \left(1 + \frac{\gamma}{\ell_\nu - 1}\right)} \quad \text{as } n \to \infty. \tag{6} \]

(b) If $\ell_\nu \le 1+\sqrt{\gamma}$,
\[ \langle p_\nu, \tilde e_\nu \rangle \xrightarrow{a.s.} 0 \quad \text{as } n \to \infty. \tag{7} \]

2.4. Distribution of sample eigenvectors

Express the eigenvector $p_\nu$ corresponding to the $\nu$th sample eigenvalue as $p_\nu = (p_{A,\nu}^T, p_{B,\nu}^T)^T$, where $p_{A,\nu}$ is the subvector corresponding to the first $M$ coordinates. Follow the convention that the $\nu$th coordinate of $p_\nu$ is nonnegative. Let $e_k$ denote the $k$th canonical basis vector in $\mathbb{R}^M$. Then the following holds.

Theorem 5. Suppose that $\ell_\nu > 1+\sqrt{\gamma}$ and that $\ell_\nu$ has multiplicity 1. Then as $n, N \to \infty$ so that $N/n - \gamma = o(n^{-1/2})$,
\[ \sqrt{n}\left(\frac{p_{A,\nu}}{\|p_{A,\nu}\|} - e_\nu\right) \Longrightarrow N_M(0, \Sigma_\nu(\ell_\nu)), \tag{8} \]
\[ \Sigma_\nu(\ell_\nu) = \left(\frac{1}{1 - \frac{\gamma}{(\ell_\nu - 1)^2}}\right) \sum_{1 \le k \ne \nu \le M} \frac{\ell_k \ell_\nu}{(\ell_k - \ell_\nu)^2}\, e_k e_k^T. \tag{9} \]

The following is a non-asymptotic result about the behavior of the $\nu$th sample eigenvector. Here $\nu$ can be any number between 1 and $\min(n, N)$.

Theorem 6. The vector $p_{B,\nu}/\|p_{B,\nu}\|$ is distributed uniformly on the unit sphere $S^{N-M-1}$ and is independent of $\|p_{B,\nu}\|$.

Theorem 6, taken in conjunction with Theorem 4 and Theorem 5, has interesting implications in the context of functional data analysis (FDA). One approach in FDA involves summarizing the data in terms of the first few principal components of the sample curves. A common technique here is to apply a smoothing to the curves before carrying out the PCA. Occasionally this is followed by smoothing of the estimated sample eigenvectors. Ramsay and Silverman (1997) detailed various methods of carrying out a functional principal component analysis (FPCA).

Think of a situation where each individual observation is a random function whose domain is an interval. Further, suppose that these functions are corrupted with additive and isotropic noise. If the true functions are smooth and belong to a finite-dimensional linear space, then it is possible to analyze these data by transforming the noisy curves in a suitable orthogonal basis, e.g., a wavelet basis or a Fourier basis. If the data are represented in terms of the first $N$ basis functions (where $N$ is the resolution of the model, or the number of equally spaced points where the measurements are taken), then the matrix of coefficients in the basis representation can be written as $X$, whose columns are $N$-dimensional vectors of wavelet or Fourier coefficients of individual curves. The corresponding model can then be described, under the assumption of Gaussianity, in terms of an $N(m, \Sigma)$ model, where $m$ is the mean vector and $\Sigma$ is the covariance matrix with eigenvalues $\ell_1 \ge \ell_2 \ge \cdots \ge \ell_M > \sigma^2 = \cdots = \sigma^2$. Here $\sigma^2$ is the variance of the isotropic noise associated with the sample curves. Our results show that when $\ell_\nu > \sigma^2(1+\sqrt{\gamma})$ and $N/n$ converges to some $\gamma \in (0,1)$, it is possible to give a fairly accurate approximation of the sample eigenvectors associated with the "signal" eigenvalues (up to a convention on sign). The similarity of the resulting expressions to the standard "signal plus noise" models prevalent in the nonparametric literature, and its implications in the context of estimating the eigenstructure of the sample curves, are being investigated by the author.

3. Representation of the Eigenvalues of S

Throughout assume that $n$ is large enough so that $N/n < 1$. Partition the matrix $S$ as
\[ S = \begin{pmatrix} S_{AA} & S_{AB} \\ S_{BA} & S_{BB} \end{pmatrix}, \]
where the suffix $A$ corresponds to the set of coordinates $\{1, \ldots, M\}$, and $B$ corresponds to the set $\{M+1, \ldots, N\}$. As before, use $\hat\ell_\nu$ and $p_\nu$ to denote the $\nu$th largest sample eigenvalue and the corresponding sample eigenvector. To avoid any ambiguity, follow the convention that the $\nu$th element of $p_\nu$ is nonnegative. Write $p_\nu$ as $p_\nu^T = (p_{A,\nu}^T, p_{B,\nu}^T)$ and denote the norm $\|p_{B,\nu}\|$ by $R_\nu$. Then almost surely $0 < R_\nu < 1$.

With this setting in place, express the first $M$ eigen-equations for $S$ as
\[ S_{AA}\, p_{A,\nu} + S_{AB}\, p_{B,\nu} = \hat\ell_\nu\, p_{A,\nu}, \quad \nu = 1, \ldots, M, \tag{10} \]
\[ S_{BA}\, p_{A,\nu} + S_{BB}\, p_{B,\nu} = \hat\ell_\nu\, p_{B,\nu}, \quad \nu = 1, \ldots, M, \tag{11} \]
\[ p_{A,\nu}^T p_{A,\nu'} + p_{B,\nu}^T p_{B,\nu'} = \delta_{\nu\nu'}, \quad 1 \le \nu, \nu' \le M, \tag{12} \]
where $\delta_{\nu\nu'}$ is the Kronecker symbol. Denote the vector $p_{A,\nu}/\|p_{A,\nu}\| = p_{A,\nu}/\sqrt{1 - R_\nu^2}$ by $a_\nu$. Thus, $\|a_\nu\| = 1$. Similarly define $q_\nu := p_{B,\nu}/R_\nu$, and again $\|q_\nu\| = 1$. Since almost surely $0 < R_\nu < 1$, and $\hat\ell_\nu I - S_{BB}$ is invertible, it follows from (11) that
\[ q_\nu = \frac{\sqrt{1 - R_\nu^2}}{R_\nu}\, (\hat\ell_\nu I - S_{BB})^{-1} S_{BA}\, a_\nu. \tag{13} \]
Divide both sides of (10) by $\sqrt{1 - R_\nu^2}$ and substitute the expression for $q_\nu$, to yield
\[ \left(S_{AA} + S_{AB}(\hat\ell_\nu I - S_{BB})^{-1} S_{BA}\right) a_\nu = \hat\ell_\nu\, a_\nu, \quad \nu = 1, \ldots, M. \tag{14} \]
This equation is important since it shows that $\hat\ell_\nu$ is an eigenvalue of the matrix $K(\hat\ell_\nu)$, where $K(x) := S_{AA} + S_{AB}(xI - S_{BB})^{-1} S_{BA}$, with corresponding eigenvector $a_\nu$. This observation is the building block for all the analyses that follow.

However, it is more convenient to express the quantities in terms of the spectral elements of the data matrix $X$. Let $\Lambda$ denote the diagonal matrix $\mathrm{diag}(\ell_1, \ldots, \ell_M)$. Because of normality, the observation matrix $X$ can be reexpressed as $X^T = [Z_A^T \Lambda^{1/2} : Z_B^T]$, where $Z_A$ is $M \times n$ and $Z_B$ is $(N-M) \times n$. The entries of $Z_A$ and $Z_B$ are i.i.d. $N(0,1)$, and $Z_A$ and $Z_B$ are mutually independent. Also assume that $Z_A$ and $Z_B$ are defined on the same probability space.
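Equation (14) is an exact algebraic identity for any realization of $S$, so it can be checked numerically. The sketch below is an illustration only: the function names `schur_K` and `eigen_equation_residual`, the dimensions, the seed and the spike values are assumptions made for the example and are not part of the paper.

```python
import numpy as np

def schur_K(S, M, x):
    """K(x) = S_AA + S_AB (x I - S_BB)^{-1} S_BA for the partition A = first M coordinates."""
    S_AA = S[:M, :M]
    S_AB = S[:M, M:]
    S_BB = S[M:, M:]
    # Solve (x I - S_BB) Y = S_BA instead of forming the inverse explicitly.
    Y = np.linalg.solve(x * np.eye(S_BB.shape[0]) - S_BB, S_AB.T)
    return S_AA + S_AB @ Y

def eigen_equation_residual(S, M, nu):
    """Return (l_hat, ||K(l_hat) a - l_hat a||) for the nu-th largest eigenvalue (nu is 0-based)."""
    eigvals, eigvecs = np.linalg.eigh(S)        # eigenvalues in ascending order
    l_hat = eigvals[-(nu + 1)]
    p = eigvecs[:, -(nu + 1)]
    a = p[:M] / np.linalg.norm(p[:M])           # normalized A-part of the eigenvector
    residual = np.linalg.norm(schur_K(S, M, l_hat) @ a - l_hat * a)
    return l_hat, residual

# Hypothetical example: two spikes, both above the threshold 1 + sqrt(N/n) = 1.5.
rng = np.random.default_rng(1)
N, n = 60, 240
spikes = np.array([5.0, 3.0])
M = spikes.size
scale = np.ones(N)
scale[:M] = np.sqrt(spikes)
X = scale[:, None] * rng.standard_normal((N, n))
S = X @ X.T / n

residuals = [eigen_equation_residual(S, M, nu)[1] for nu in range(M)]
print("residuals of (14):", residuals)
```

The residuals should be at the level of floating point error as long as $\hat\ell_\nu$ is well separated from the spectrum of $S_{BB}$, which is the situation the paper studies.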
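Write the singular value decomposition of $Z_B/\sqrt{n}$ as
\[ \frac{1}{\sqrt{n}}\, Z_B = V \mathcal{M}^{1/2} H^T, \tag{15} \]
where $\mathcal{M}$ is the $(N-M) \times (N-M)$ diagonal matrix of the eigenvalues of $S_{BB}$ in decreasing order; $V$ is the $(N-M) \times (N-M)$ matrix of eigenvectors of $S_{BB}$; and $H$ is the $n \times (N-M)$ matrix of right singular vectors. Denote the diagonal elements of $\mathcal{M}$ by $\mu_1 > \cdots > \mu_{N-M}$, suppressing the dependence on $n$.

Note that the columns of $V$ form a complete orthonormal basis for $\mathbb{R}^{N-M}$, while the columns of $H$ form an orthonormal basis of an $(N-M)$ dimensional subspace (the row space of $Z_B$) of $\mathbb{R}^n$.

Define $T := (1/\sqrt{n})\, H^T Z_A^T$. $T$ is an $(N-M) \times M$ matrix with columns $t_1, \ldots, t_M$. The most important property about $T$ is that the vectors $t_1, \ldots, t_M$ are distributed as i.i.d. $N(0, \frac{1}{n} I_{N-M})$ and are independent of $Z_B$. This is because the columns of $H$ form an orthonormal set of vectors, the rows of $Z_A$ are i.i.d. $N_n(0, I)$ vectors, and $Z_A$ and $Z_B$ are independent.

Thus, (14) can be expressed as
\[ \left(S_{AA} + \Lambda^{1/2} T^T \mathcal{M} (\hat\ell_\nu I - \mathcal{M})^{-1} T \Lambda^{1/2}\right) a_\nu = \hat\ell_\nu\, a_\nu, \quad \nu = 1, \ldots, M. \tag{16} \]
Also, $K(x)$ can be expressed as
\[ K(x) = S_{AA} + \Lambda^{1/2} T^T \mathcal{M} (xI - \mathcal{M})^{-1} T \Lambda^{1/2}. \tag{17} \]
Rewrite (12) in terms of the vectors $\{a_\nu : \nu = 1, \ldots, M\}$ as
\[ a_\nu^T \left[ I + \Lambda^{1/2} T^T (\hat\ell_\nu I - \mathcal{M})^{-1} \mathcal{M} (\hat\ell_{\nu'} I - \mathcal{M})^{-1} T \Lambda^{1/2} \right] a_{\nu'} = \frac{1}{1 - R_\nu^2}\, \delta_{\nu\nu'}, \quad 1 \le \nu, \nu' \le M. \tag{18} \]

Proofs of the theorems depend heavily on the asymptotic behavior of the largest eigenvalue, as well as the Empirical Spectral Distribution (ESD) of Wishart matrices in the null (i.e., identity covariance) case. Throughout, the ESD of $S_{BB}$ is denoted by $\hat F_{n,N-M}$. We know that (Bai (1999))
\[ \hat F_{n,N-M} \Longrightarrow F_\gamma, \quad \text{almost surely as } n \to \infty. \tag{19} \]
The following result, proved in Appendix A, is about the deviation of the largest eigenvalue of $S_{BB}$ from its limiting value $\kappa_\gamma := (1+\sqrt{\gamma})^2$. The importance of this result is that, for proving the limit theorems, it is enough to do calculations by restricting attention to sets of the form $\mu_1 \le \kappa_\gamma + \delta$ for some suitably chosen $\delta > 0$.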
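Before the deviation bound is stated, the spectral representation (15)-(17) can be sanity-checked numerically: $K(x)$ computed from the blocks of $S$ should agree with $K(x)$ computed from $T$ and $\mathcal{M}$. This is a minimal sketch with assumed dimensions, seed and spike values; the names `K_schur` and `K_spectral` are illustrative and not from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
N, n = 40, 160
ell = np.array([5.0, 3.0])            # hypothetical spiked eigenvalues (diagonal of Lambda)
M = ell.size

# X^T = [Z_A^T Lambda^{1/2} : Z_B^T], i.e. X stacks Lambda^{1/2} Z_A on top of Z_B.
Z_A = rng.standard_normal((M, n))
Z_B = rng.standard_normal((N - M, n))
X = np.vstack([np.sqrt(ell)[:, None] * Z_A, Z_B])
S = X @ X.T / n
S_AA, S_AB, S_BB = S[:M, :M], S[:M, M:], S[M:, M:]

# Thin SVD of Z_B / sqrt(n) = V M^{1/2} H^T, as in (15).
V, sing, Ht = np.linalg.svd(Z_B / np.sqrt(n), full_matrices=False)
mu = sing ** 2                        # eigenvalues of S_BB in decreasing order
H = Ht.T                              # n x (N - M), orthonormal columns
T = H.T @ Z_A.T / np.sqrt(n)          # (N - M) x M

def K_schur(x):
    """K(x) from the block form S_AA + S_AB (x I - S_BB)^{-1} S_BA."""
    return S_AA + S_AB @ np.linalg.solve(x * np.eye(N - M) - S_BB, S_AB.T)

def K_spectral(x):
    """K(x) from the representation (17) in terms of T and the eigenvalues mu."""
    L_half = np.diag(np.sqrt(ell))
    return S_AA + L_half @ T.T @ np.diag(mu / (x - mu)) @ T @ L_half

x = mu[0] + 1.0                       # any point above the largest eigenvalue of S_BB
print("max abs difference:", np.max(np.abs(K_schur(x) - K_spectral(x))))
```

The same construction also makes it easy to inspect how far $\mu_1$ sits from $\kappa_\gamma$ for a given $n$, which is what the next result quantifies.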
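Proposition 1. For any $0 < \delta < \kappa_\gamma/2$,
\[ P(\mu_1 - \kappa_\gamma > \delta) \le \exp\left(-\frac{3n\delta^2}{64\,\kappa_\gamma}\right), \quad \text{for } n \ge n_0(\gamma, \delta), \tag{20} \]
where $n_0(\gamma, \delta)$ is an integer large enough that $\left|(1 + \sqrt{(N-M)/n})^2 - \kappa_\gamma\right| \le \delta/4$ for $n \ge n_0(\gamma, \delta)$.

4. Proof of Theorem 3

The first step here is to utilize the eigen-equation (16) to get
\[ \hat\ell_\nu = a_\nu^T \left(S_{AA} + \Lambda^{1/2} T^T \mathcal{M} (\hat\ell_\nu I - \mathcal{M})^{-1} T \Lambda^{1/2}\right) a_\nu. \tag{21} \]
From this, after some manipulations (see Section 5.1 for details),
\[ \sqrt{n}\,(\hat\ell_\nu - \rho_\nu)\left(1 + \ell_\nu\, t_\nu^T \mathcal{M} (\rho_\nu I - \mathcal{M})^{-2} t_\nu + d_\nu\right) = \sqrt{n}\left(s_{\nu\nu} + \ell_\nu\, t_\nu^T \mathcal{M} (\rho_\nu I - \mathcal{M})^{-1} t_\nu - \rho_\nu\right) + o_P(1), \tag{22} \]
where $d_\nu = \ell_\nu(\rho_\nu - \hat\ell_\nu)\left(t_\nu^T \mathcal{M} (\rho_\nu I - \mathcal{M})^{-2} (\hat\ell_\nu I - \mathcal{M})^{-1} t_\nu + O_P(1)\right)$, and $s_{\nu\nu}$ is the $(\nu,\nu)$th element of $S$. It follows readily that $d_\nu = o_P(1)$. It will be shown that the term on the RHS of (22) converges in distribution to a Gaussian random variable with zero mean and variance
\[ 2\ell_\nu \rho_\nu \left(1 + \ell_\nu \gamma \int \frac{x}{(\rho_\nu - x)^2}\, dF_\gamma(x)\right). \tag{23} \]
Next, from Proposition 2 stated below (and proved in Appendix B), it follows that
\[ t_\nu^T \mathcal{M} (\rho_\nu I - \mathcal{M})^{-2} t_\nu \xrightarrow{a.s.} \gamma \int \frac{x}{(\rho_\nu - x)^2}\, dF_\gamma(x). \tag{24} \]
Hence (4), with $\sigma^2(\ell_\nu)$ given by the first expression in (5), follows from (23), (24), (22) and Slutsky's Theorem. Application of (58) gives the second equality in (5), and the third follows from simple algebra.

Proposition 2. Suppose that $N/n \to \gamma \in (0,1)$ as $n \to \infty$. Let $\delta, \epsilon > 0$ satisfy $\epsilon < [4(\kappa_\gamma + \delta/2)/(\rho - \kappa_\gamma - \delta/2)^2]\,[(N-M)/n]$ and $\rho \ge \kappa_\gamma + \delta$. Then there is $n(\rho, \delta, \epsilon, \gamma)$ such that for all $n \ge n(\rho, \delta, \epsilon, \gamma)$,
\[ P\left( \left| t_j^T \mathcal{M} (\rho I - \mathcal{M})^{-2} t_j - \gamma \int \frac{x}{(\rho - x)^2}\, dF_\gamma(x) \right| > \epsilon,\ \mu_1 \le \kappa_\gamma + \frac{\delta}{2} \right) \]
\[ \le 2\exp\left(-\frac{n}{N-M} \cdot \frac{n(\epsilon/4)^2 (\rho - \kappa_\gamma - \delta/2)^4}{6(\kappa_\gamma + \delta/2)^2}\right) + 2\exp\left(-\frac{n}{n+N-M} \cdot \frac{n^2(\epsilon/4)^2}{2} \cdot \frac{(\rho - \kappa_\gamma - \delta/2)^6}{16\rho^2(\kappa_\gamma + \delta/2)}\right) \]
\[ \quad + 2\exp\left(-\frac{n}{n+N-M} \cdot \frac{n^2(\epsilon/4)^2}{2} \cdot \frac{(\rho - \kappa_\gamma - \delta/2)^4}{4(\kappa_\gamma + \delta/2)}\right), \quad 1 \le j \le M. \]

The main term on the RHS of (22) can be expressed as $W_n + W_n'$, where
\[ W_n = \sqrt{n}\left(s_{\nu\nu} - (1-\gamma)\ell_\nu + \ell_\nu\, t_\nu^T \mathcal{M} (\rho_\nu I - \mathcal{M})^{-1} t_\nu - \ell_\nu \frac{1}{n}\mathrm{trace}\left(\rho_\nu(\rho_\nu I - \mathcal{M})^{-1}\right)\right), \]
\[ W_n' = \sqrt{n}\,\ell_\nu\left(\frac{1}{n}\mathrm{trace}\left(\rho_\nu(\rho_\nu I - \mathcal{M})^{-1}\right) - \frac{\gamma\ell_\nu}{\ell_\nu - 1}\right). \]
Note that by (57), $\gamma\ell_\nu/(\ell_\nu - 1) = \gamma[1 + (1/(\ell_\nu - 1))] = \gamma \int [\rho_\nu/(\rho_\nu - x)]\, dF_\gamma(x)$. On the other hand
\[ \frac{1}{n}\mathrm{trace}\left(\rho_\nu(\rho_\nu I - \mathcal{M})^{-1}\right) = \frac{N-M}{n} \int \frac{\rho_\nu}{\rho_\nu - x}\, d\hat F_{n,N-M}(x). \]
Since the function $1/(\rho_\nu - z)$ is analytic in an open set containing the interval $[(1-\sqrt{\gamma})^2, (1+\sqrt{\gamma})^2]$, from Bai and Silverstein (2004, Thm 1.1) the sequence $W_n' = o_P(1)$ because $(N/n) - \gamma = o(n^{-1/2})$.

4.1. Asymptotic normality of $W_n$

Recall that by definition of $T$, $t_\nu = (1/\sqrt{n})\, H^T Z_{A,\nu}$ where $Z_{A,\nu}^T$ is the $\nu$th row of $Z_A$. Since $N - M < n$, and the columns of $H$ are orthonormal, one can extend them to an orthonormal basis of $\mathbb{R}^n$ given by the matrix $\tilde H = [H : H_c]$, where $H_c$ is $n \times (n - N + M)$. Thus, $\tilde H \tilde H^T = \tilde H^T \tilde H = I_n$. Write
\[ s_{\nu\nu} = \ell_\nu \frac{1}{n} Z_{A,\nu}^T Z_{A,\nu} = \ell_\nu \frac{1}{n} Z_{A,\nu}^T \tilde H \tilde H^T Z_{A,\nu} = \ell_\nu\left(\left\|\frac{1}{\sqrt{n}} H^T Z_{A,\nu}\right\|^2 + \left\|\frac{1}{\sqrt{n}} H_c^T Z_{A,\nu}\right\|^2\right) = \ell_\nu\left(\|t_\nu\|^2 + \|w_\nu\|^2\right), \]
with $w_\nu := (1/\sqrt{n})\, H_c^T Z_{A,\nu}$. Thus $w_\nu \sim N(0, I_{n-N+M}/n)$; $t_\nu \sim N(0, I_{N-M}/n)$; and these are mutually independent and independent of $Z_B$. Therefore, one can represent $W_n$ as a sum of two independent random variables $W_{1,n}$ and $W_{2,n}$, where
\[ W_{1,n} = \ell_\nu \sqrt{n}\left(\|w_\nu\|^2 - (1-\gamma)\right), \]
\[ W_{2,n} = \ell_\nu \sqrt{n}\left(t_\nu^T \rho_\nu(\rho_\nu I - \mathcal{M})^{-1} t_\nu - \frac{1}{n}\mathrm{trace}\left(\rho_\nu(\rho_\nu I - \mathcal{M})^{-1}\right)\right). \]
Since $n\|w_\nu\|^2 \sim \chi^2_{n-N+M}$ and $N/n - \gamma = o(n^{-1/2})$, it follows that $W_{1,n} \Longrightarrow N(0, 2\ell_\nu^2(1-\gamma))$. Later, it is shown that
\[ W_{2,n} \Longrightarrow N\left(0,\ 2\ell_\nu^2 \gamma \int \frac{\rho_\nu^2}{(\rho_\nu - x)^2}\, dF_\gamma(x)\right). \tag{25} \]
Therefore, the asymptotic normality of $W_n$ is established. $W_n' = o_P(1)$ then implies asymptotic normality of the RHS of (22). The expression (23) for asymptotic variance is then deduced from the identity
\[ \int \frac{\rho_\nu^2}{(\rho_\nu - x)^2}\, dF_\gamma(x) = 1 + \frac{1}{\ell_\nu - 1} + \rho_\nu \int \frac{x}{(\rho_\nu - x)^2}\, dF_\gamma(x), \]
which follows from (57). Therefore, the asymptotic variance of $W_n$ is
\[ 2\ell_\nu^2(1-\gamma) + 2\ell_\nu^2 \gamma \int \frac{\rho_\nu^2}{(\rho_\nu - x)^2}\, dF_\gamma(x) = 2\ell_\nu^2\left(1 + \frac{\gamma}{\ell_\nu - 1}\right) + 2\ell_\nu^2 \gamma \rho_\nu \int \frac{x}{(\rho_\nu - x)^2}\, dF_\gamma(x), \]
from which (23) follows since $\ell_\nu(1 + \gamma/(\ell_\nu - 1)) = \rho_\nu$.

4.2. Proof of (25)

Let $t_\nu = (t_{\nu,1}, \ldots, t_{\nu,N-M})^T$, $t_{\nu,j} \overset{i.i.d.}{\sim} N(0, 1/n)$ and independent of $\mathcal{M}$. Hence, if $y_j = \sqrt{n}\, t_{\nu,j}$, then
\[ W_{2,n} = \ell_\nu \rho_\nu \frac{1}{\sqrt{n}}\left(\sum_{j=1}^{N-M} \frac{1}{\rho_\nu - \mu_j}\, y_j^2 - \sum_{j=1}^{N-M} \frac{1}{\rho_\nu - \mu_j}\right), \]
where $\{y_j\}_{j=1}^{N-M} \overset{i.i.d.}{\sim} N(0,1)$, and $\{y_j\}_{j=1}^{N-M}$ is independent of $\mathcal{M}$. To establish (25) it suffices to show that, for all $t \in \mathbb{R}$,
\[ \phi_{W_{2,n}}(t) := E\exp(itW_{2,n}) \to \phi_{\tilde\sigma^2(\ell_\nu)}(t) := \exp\left(-\frac{t^2 \tilde\sigma^2(\ell_\nu)}{2}\right), \quad \text{as } n \to \infty, \]
where $\tilde\sigma^2(\ell) = 2\ell^2 \gamma \int \rho^2(\ell)/(\rho(\ell) - x)^2\, dF_\gamma(x)$ for $\ell > 1+\sqrt{\gamma}$. Define $J_\gamma(\delta) := \{\mu_1 \le \kappa_\gamma + \delta\}$, where $\delta > 0$ is any number such that $\rho_\nu > \kappa_\gamma + 2\delta$. Note that $J_\gamma(\delta)$ is a measurable set that depends on $n$, and $P(J_\gamma(\delta)) \to 1$ as $n \to \infty$ by Proposition 1. Thus we need only establish that for all $t \in \mathbb{R}$,
\[ E\left[ \left| E\left(e^{itW_{2,n}} \mid \mathcal{M}\right) - \exp\left(-\frac{t^2 \tilde\sigma^2(\ell_\nu)}{2}\right) \right| \mathbf{1}\{\mu_1 \le \kappa_\gamma + \delta\} \right] \to 0, \quad \text{as } n \to \infty, \tag{26} \]
where the outer expectation is with respect to the distribution of $\mathcal{M}$. Since the characteristic function of a $\chi^2_1$ random variable is $\psi(x) = 1/\sqrt{1 - 2ix}$, on the set $\{\mu_1 \le \kappa_\gamma + \delta\}$, the inner conditional expectation is
\[ \prod_{j=1}^{N-M} \psi\left(\frac{t\ell_\nu\rho_\nu}{\sqrt{n}(\rho_\nu - \mu_j)}\right) \exp\left(-\frac{it\ell_\nu\rho_\nu}{\sqrt{n}} \sum_{j=1}^{N-M} \frac{1}{\rho_\nu - \mu_j}\right) = \prod_{j=1}^{N-M} \left(1 - \frac{2it\ell_\nu\rho_\nu}{\sqrt{n}(\rho_\nu - \mu_j)}\right)^{-1/2} \exp\left(-\frac{it\ell_\nu\rho_\nu}{\sqrt{n}} \sum_{j=1}^{N-M} \frac{1}{\rho_\nu - \mu_j}\right). \tag{27} \]
Let $\log z$ ($z \in \mathbb{C}$) be the principal branch of the complex logarithm. Then
\[ \left(1 - \frac{2it\ell_\nu\rho_\nu}{\sqrt{n}(\rho_\nu - \mu_j)}\right)^{-1/2} = \exp\left(-\frac{1}{2}\log\left(1 - \frac{2it\ell_\nu\rho_\nu}{\sqrt{n}(\rho_\nu - \mu_j)}\right)\right). \]
In view of the Taylor series expansion of $\log(1+z)$ (valid for $|z| < 1$), for $n \ge n_*(\nu, \gamma, \delta)$, large enough so that $(|t|\ell_\nu\rho_\nu)/(\sqrt{n}(\rho_\nu - \kappa_\gamma - \delta)) < 1/2$, the conditional expectation (27) is
\[ \exp\left(\frac{1}{2}\sum_{j=1}^{N-M}\sum_{k=1}^{\infty} \frac{1}{k}\left(\frac{2it\ell_\nu\rho_\nu}{\sqrt{n}}\right)^k \left(\frac{1}{\rho_\nu - \mu_j}\right)^k - \frac{it\ell_\nu\rho_\nu}{\sqrt{n}} \sum_{j=1}^{N-M} \frac{1}{\rho_\nu - \mu_j}\right). \]
The inner sum is dominated by a geometric series and hence is finite for $n \ge n_*(\nu, \gamma, \delta)$ on the set $J_\gamma(\delta)$. Interchanging the order of summations, on $J_\gamma(\delta)$ the term within the exponent becomes
\[ -\frac{t^2}{2}\left[2\ell_\nu^2\rho_\nu^2 \frac{1}{n}\sum_{j=1}^{N-M} \frac{1}{(\rho_\nu - \mu_j)^2}\right] + \frac{1}{2}\sum_{k=3}^{\infty} \frac{1}{k}\left(\frac{2it\ell_\nu\rho_\nu}{\sqrt{n}}\right)^k \sum_{j=1}^{N-M} \frac{1}{(\rho_\nu - \mu_j)^k}. \tag{28} \]
Denote the first term of (28) by $a_n(t)$ and the second term by $\tilde r_n(t)$. For $n \ge n_*(\nu, \gamma, \delta)$, on $J_\gamma(\delta)$,
\[ |\tilde r_n(t)| \le \frac{t^2}{3}\left[2\ell_\nu^2\rho_\nu^2 \frac{1}{n}\sum_{j=1}^{N-M} \frac{1}{(\rho_\nu - \mu_j)^2}\right] \frac{2|t|\ell_\nu\rho_\nu}{\sqrt{n}(\rho_\nu - \kappa_\gamma - \delta)} \left(1 - \frac{2|t|\ell_\nu\rho_\nu}{\sqrt{n}(\rho_\nu - \kappa_\gamma - \delta)}\right)^{-1}. \tag{29} \]
Let $G_2(\cdot\,; \rho, \gamma, \delta)$ be the bounded function ($\rho > \kappa_\gamma + \delta$) defined through (55). Then on $J_\gamma(\delta)$, $(1/n)\sum_{j=1}^{N-M}[1/(\rho_\nu - \mu_j)^2] = ((N-M)/n)\int G_2(x; \rho_\nu, \gamma, \delta)\, d\hat F_{n,N-M}(x)$ and the quantity on the RHS converges almost surely to $\gamma \int G_2(x; \rho_\nu, \gamma, \delta)\, dF_\gamma(x) = \gamma \int [1/(\rho_\nu - x)^2]\, dF_\gamma(x)$ because of the continuity of $G_2(\cdot\,; \rho_\nu, \gamma, \delta)$ and (19). Moreover, on $J_\gamma(\delta)$, $a_n(t)$ and $\tilde r_n(t)$ are bounded for $n \ge n_*(\nu, \gamma, \delta)$. Therefore, from this observation, and (27) and (28), the sequence in (26) is bounded by
\[ E\left[\left|\exp\left(a_n(t) + \frac{t^2\tilde\sigma^2(\ell_\nu)}{2}\right)\right| \left(\exp(|\tilde r_n(t)|) - 1\right) I_{J_\gamma(\delta)}\right] + E\left[\left|\exp\left(a_n(t) + \frac{t^2\tilde\sigma^2(\ell_\nu)}{2}\right) - 1\right| I_{J_\gamma(\delta)}\right], \]
which converges to zero by the Bounded Convergence Theorem.

5. Approximation to the Eigenvectors

This section deals with an asymptotic expansion of the eigenvector $a_\nu$ of the matrix $K(\hat\ell_\nu)$ associated with the eigenvalue $\hat\ell_\nu$, when $\ell_\nu$ is greater than $1+\sqrt{\gamma}$ and has multiplicity 1. This expansion has already been used in the proof of Theorem 3. An important step, presented through Lemma 1, is to provide a suitable bound for the remainder in the expansion. The approach taken here follows the perturbation analysis approach in Kneip and Utikal (2001) (see also Kato (1980, Chap. 2)). For the benefit of the readers, the steps leading to the expansion are outlined below.

First observe that $\rho_\nu$ is the eigenvalue of $(\rho_\nu/\ell_\nu)\Lambda$ associated with the eigenvector $e_\nu$. Define
\[ R_\nu = \sum_{k \ne \nu}^{M} \frac{\ell_\nu}{\rho_\nu(\ell_k - \ell_\nu)}\, e_k e_k^T. \tag{30} \]
Note that $R_\nu$ is the resolvent of $(\rho_\nu/\ell_\nu)\Lambda$ "evaluated" at $\rho_\nu$. Then utilize the defining equation (16) to write
\[ \left(\frac{\rho_\nu}{\ell_\nu}\Lambda - \rho_\nu I\right) a_\nu = -\left(K(\hat\ell_\nu) - \frac{\rho_\nu}{\ell_\nu}\Lambda\right) a_\nu + (\hat\ell_\nu - \rho_\nu)\, a_\nu. \]
Define $D_\nu = K(\hat\ell_\nu) - (\rho_\nu/\ell_\nu)\Lambda$; premultiply both sides by $R_\nu$; and observe that $R_\nu((\rho_\nu/\ell_\nu)\Lambda - \rho_\nu I) = I_M - e_\nu e_\nu^T := P_\nu^\perp$. Then
\[ P_\nu^\perp a_\nu = -R_\nu D_\nu a_\nu + (\hat\ell_\nu - \rho_\nu) R_\nu a_\nu. \tag{31} \]
As a convention, suppose that $\langle e_\nu, a_\nu \rangle \ge 0$. Then write $a_\nu = \langle e_\nu, a_\nu \rangle e_\nu + P_\nu^\perp a_\nu$ and observe that $R_\nu e_\nu = 0$. Hence
\[ a_\nu - e_\nu = -R_\nu D_\nu e_\nu + r_\nu, \tag{32} \]
where $r_\nu = -(1 - \langle e_\nu, a_\nu \rangle) e_\nu - R_\nu D_\nu (a_\nu - e_\nu) + (\hat\ell_\nu - \rho_\nu) R_\nu (a_\nu - e_\nu)$. Now, define
\[ \alpha_\nu = \|R_\nu D_\nu\| + |\hat\ell_\nu - \rho_\nu|\, \|R_\nu\|, \quad \text{and} \quad \beta_\nu = \|R_\nu D_\nu e_\nu\|. \tag{33} \]
The following lemma gives a bound on the residual $r_\nu$. The proof can be found in Paul (2004).

Lemma 1. $r_\nu$ satisfies
\[ \|r_\nu\| \le \begin{cases} \beta_\nu\left[\dfrac{\alpha_\nu(1+\alpha_\nu)}{1 - \alpha_\nu(1+\alpha_\nu)} + \dfrac{\beta_\nu}{(1 - \alpha_\nu(1+\alpha_\nu))^2}\right] & \text{if } \alpha_\nu < \dfrac{\sqrt{5}-1}{2}, \\[2ex] \alpha_\nu^2 + 2\alpha_\nu\beta_\nu & \text{always.} \end{cases} \tag{34} \]

The next task is to establish that $\beta_\nu = o_P(1)$ and $\alpha_\nu = o_P(1)$. First, write
\[ D_\nu = (S_{AA} - \Lambda) + \Lambda^{1/2}\left[T^T \mathcal{M} (\rho_\nu I - \mathcal{M})^{-1} T - \frac{1}{n}\mathrm{trace}\left(\mathcal{M}(\rho_\nu I - \mathcal{M})^{-1}\right) I\right]\Lambda^{1/2} \]
\[ \quad + \left[\frac{1}{n}\mathrm{trace}\left(\mathcal{M}(\rho_\nu I - \mathcal{M})^{-1}\right) - \gamma \int \frac{x}{\rho_\nu - x}\, dF_\gamma(x)\right]\Lambda + (\rho_\nu - \hat\ell_\nu)\,\Lambda^{1/2} T^T \mathcal{M} (\rho_\nu I - \mathcal{M})^{-1} (\hat\ell_\nu I - \mathcal{M})^{-1} T \Lambda^{1/2}. \tag{35} \]
Since $\hat\ell_\nu \xrightarrow{a.s.} \rho_\nu > \kappa_\gamma$ and $\mu_1 \xrightarrow{a.s.} \kappa_\gamma$, in view of the analysis carried out in Section 4, it is straightforward to see that $\|D_\nu\| \xrightarrow{a.s.} 0$. Therefore, $\alpha_\nu \xrightarrow{a.s.} 0$ and $\beta_\nu \xrightarrow{a.s.} 0$ from (33). However, it is possible to get a much better bound for $\beta_\nu$. Define $V^{(i,\nu)} := T^T \mathcal{M} (\rho_\nu I - \mathcal{M})^{-i} T - (1/n)\,\mathrm{trace}(\mathcal{M}(\rho_\nu I - \mathcal{M})^{-i})\, I$ for $i = 1, 2$. Expand $D_\nu$ up to second order around $\rho_\nu$, and observe that $R_\nu \Delta e_\nu = 0$ for any diagonal matrix $\Delta$. From this, and (17), it follows that
\[ R_\nu D_\nu e_\nu = R_\nu\left(K(\rho_\nu) - \frac{\rho_\nu}{\ell_\nu}\Lambda\right) e_\nu + (\rho_\nu - \hat\ell_\nu)\, R_\nu \bar K(\rho_\nu)\, e_\nu + (\hat\ell_\nu - \rho_\nu)^2\, \bar r_\nu, \tag{36} \]
where $\bar K(\rho_\nu) = \Lambda^{1/2} V^{(2,\nu)} \Lambda^{1/2}$; and $\|\bar r_\nu\| = O_P(1)$. Note that (i) all terms, except those on the diagonal, of the matrix $\Lambda^{1/2} T^T \mathcal{M} (\rho_\nu I - \mathcal{M})^{-i} T \Lambda^{1/2}$ are $O_P(n^{-1/2})$ for $i = 1, 2$ (from an inequality similar to (41)), (ii) all except the diagonal of $S_{AA}$ are $O_P(n^{-1/2})$, and (iii) $R_\nu$ is diagonal with $(\nu,\nu)$th entry equal to 0. It now follows easily that
\[ \beta_\nu = O_P(n^{-1/2}) + (\hat\ell_\nu - \rho_\nu)^2\, O_P(1). \tag{37} \]

5.1. Explanation for expansion (22)

The RHS of (21) can be written as
\[ e_\nu^T K(\hat\ell_\nu) e_\nu + 2 e_\nu^T K(\hat\ell_\nu)(a_\nu - e_\nu) + (a_\nu - e_\nu)^T K(\hat\ell_\nu)(a_\nu - e_\nu). \tag{38} \]
The first term in (38) is the major component of (22), and it can be written as
\[ s_{\nu\nu} + \ell_\nu\, t_\nu^T \mathcal{M} (\rho_\nu I - \mathcal{M})^{-1} t_\nu + (\rho_\nu - \hat\ell_\nu)\,\ell_\nu\, t_\nu^T \mathcal{M} (\rho_\nu I - \mathcal{M})^{-2} t_\nu + (\hat\ell_\nu - \rho_\nu)^2\, \ell_\nu\, t_\nu^T \mathcal{M} (\rho_\nu I - \mathcal{M})^{-2} (\hat\ell_\nu I - \mathcal{M})^{-1} t_\nu. \]
Again, from (32), (33), (34) and (37),
\[ (a_\nu - e_\nu)^T K(\hat\ell_\nu)(a_\nu - e_\nu) = \|a_\nu - e_\nu\|^2\, O_P(1) = \beta_\nu^2\, O_P(1) = O_P(n^{-1}) + (\hat\ell_\nu - \rho_\nu)^2\, O_P(n^{-1/2}) + (\hat\ell_\nu - \rho_\nu)^4\, O_P(1). \]
To check the negligibility of the second term in (38), observe that by (32),
\[ e_\nu^T K(\hat\ell_\nu)(a_\nu - e_\nu) = -e_\nu^T D_\nu R_\nu D_\nu e_\nu + e_\nu^T K(\hat\ell_\nu)\, r_\nu = -e_\nu^T D_\nu R_\nu D_\nu e_\nu + o_P(n^{-1/2}) + (\hat\ell_\nu - \rho_\nu)^2\, o_P(1), \]
where the last equality is due to (34), (37) and $\alpha_\nu = o_P(1)$. Use (36), and the definition of $R_\nu$, to get the expression
\[ e_\nu^T D_\nu R_\nu D_\nu e_\nu = \sum_{j \ne \nu}^{M} \frac{\ell_\nu}{\rho_\nu} \cdot \frac{\ell_j \ell_\nu}{\ell_j - \ell_\nu} \left[\frac{s_{j\nu}}{\sqrt{\ell_j \ell_\nu}} + V^{(1,\nu)}_{j\nu} + (\rho_\nu - \hat\ell_\nu)\, V^{(2,\nu)}_{j\nu} + (\hat\ell_\nu - \rho_\nu)^2\, \tilde V^{(3,\nu)}_{j\nu}\right]^2, \]
where $\tilde V^{(3,\nu)} = T^T \mathcal{M} (\rho_\nu I - \mathcal{M})^{-2} (\hat\ell_\nu I - \mathcal{M})^{-1} T$. Observe that for $j \ne \nu$, each of the terms $s_{j\nu}$, $V^{(1,\nu)}_{j\nu}$ and $V^{(2,\nu)}_{j\nu}$ is $O_P(n^{-1/2})$ and $\tilde V^{(3,\nu)}_{j\nu} = O_P(1)$. It follows that $e_\nu^T D_\nu R_\nu D_\nu e_\nu = O_P(n^{-1}) + (\hat\ell_\nu - \rho_\nu)^2\, O_P(n^{-1/2}) + (\hat\ell_\nu - \rho_\nu)^4\, O_P(1)$.

5.2. Proof of Theorem 4

Part (a). As a convention, choose $\langle p_\nu, \tilde e_\nu \rangle \ge 0$. First, note that with $p_{A,\nu}$ as in (10), $\langle p_\nu, \tilde e_\nu \rangle = \langle p_{A,\nu}, e_\nu \rangle = \sqrt{1 - R_\nu^2}\, \langle a_\nu, e_\nu \rangle$. Since $\alpha_\nu \xrightarrow{a.s.} 0$ and $\beta_\nu \xrightarrow{a.s.} 0$, from (32) and (34), it follows that $\langle a_\nu, e_\nu \rangle \xrightarrow{a.s.} 1$. Therefore, from (18), (24), Theorem 2 and the above display,
\[ \frac{1}{1 - R_\nu^2} \xrightarrow{a.s.} 1 + \ell_\nu \gamma \int \frac{x}{(\rho_\nu - x)^2}\, dF_\gamma(x), \]
from which (6) follows in view of Lemma B.2.

Part (b). From (18), it is clear that for (7) to hold, either $a_\nu^T \Lambda^{1/2} T^T \mathcal{M} (\hat\ell_\nu I - \mathcal{M})^{-2} T \Lambda^{1/2} a_\nu \xrightarrow{a.s.} \infty$, or $\langle a_\nu, e_\nu \rangle \xrightarrow{a.s.} 0$. Hence, it suffices to show that the smallest eigenvalue of the matrix $E_\nu := T^T \mathcal{M} (\hat\ell_\nu I - \mathcal{M})^{-2} T$ diverges to infinity almost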
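surely. The approach is to show that given $\epsilon > 0$, there is a $C_\epsilon > 0$ such that the probability $P(\lambda_{\min}(E_\nu) \le C_\epsilon)$ is summable over $n$, and that $C_\epsilon \to \infty$ as $\epsilon \to 0$. Denote the rows of $T$ by $\mathbf{t}_j^T$, $j = 1, \ldots, N-M$ (treated as a $1 \times M$ vector); the $\mathbf{t}_j$'s are to be distinguished from the vectors $t_1, \ldots, t_M$, the columns of $T$. In fact, $\mathbf{t}_j^T = (t_{j1}, \ldots, t_{jM})$. Then
\[ E_\nu = \sum_{j=1}^{N-M} \frac{\mu_j}{(\hat\ell_\nu - \mu_j)^2}\, \mathbf{t}_j \mathbf{t}_j^T \ge \sum_{j=\nu}^{N-M} \frac{\mu_j}{(\hat\ell_\nu - \mu_j)^2}\, \mathbf{t}_j \mathbf{t}_j^T =: E_\nu^*, \quad \text{say}, \]
in the sense of inequalities between positive semi-definite matrices. Thus $\lambda_{\min}(E_\nu) \ge \lambda_{\min}(E_\nu^*)$. Then on the set $\bar J_{1,\epsilon} := \{\hat\ell_\nu < \kappa_\gamma + \epsilon,\ \mu_1 < \kappa_\gamma + \epsilon/2\}$,
\[ E_\nu^* \ge \sum_{j=\nu}^{N-M} \frac{\mu_j}{(\kappa_\gamma + \epsilon - \mu_j)^2}\, \mathbf{t}_j \mathbf{t}_j^T =: \bar E_\nu^* \]
since $\hat\ell_\nu \ge \mu_\nu$, by the interlacing inequality of eigenvalues of symmetric matrices (Rao (1973, Sec. 1f)). Thus, in view of Proposition 1, we need only provide a lower bound for the smallest eigenvalue of $\bar E_\nu^*$. However, it is more convenient to work with the matrix
\[ \bar E = \sum_{j=1}^{N-M} \frac{\mu_j}{(\kappa_\gamma + \epsilon - \mu_j)^2}\, \mathbf{t}_j \mathbf{t}_j^T = T^T \mathcal{M} ((\kappa_\gamma + \epsilon) I - \mathcal{M})^{-2} T. \tag{39} \]
Proving summability of $P(\lambda_{\min}(\bar E) \le C_\epsilon,\ \bar J_{1,\epsilon})$ (where $C_\epsilon \to \infty$ as $\epsilon \to 0$) suffices, because it is easy to check that $P(\|\bar E - \bar E_\nu^*\| > \epsilon,\ \bar J_{1,\epsilon})$ is summable.

By Proposition 2, given $0 < \delta < [16(\kappa_\gamma + \epsilon/2)/\epsilon^2]\,[(N-M)/n]$, there is an $n_*(\delta, \epsilon, \gamma)$ such that, for all $j = 1, \ldots, M$, for all $n \ge n_*(\delta, \epsilon, \gamma)$,
\[ P\left( \left| t_j^T \mathcal{M} ((\kappa_\gamma + \epsilon) I - \mathcal{M})^{-2} t_j - \gamma \int \frac{x}{(\kappa_\gamma + \epsilon - x)^2}\, dF_\gamma(x) \right| > \delta,\ \bar J_{1,\epsilon} \right) \le \varepsilon_1(n), \tag{40} \]
where $\varepsilon_1(n)$ is summable in $n$. On the other hand, since $\sqrt{n}\, t_j \sim N(0, I_{N-M})$ for $j = 1, \ldots, M$, and on $J_\gamma(\epsilon/2)$
\[ \left\| \mathcal{M} ((\kappa_\gamma + \epsilon) I - \mathcal{M})^{-2} \right\| = \frac{\mu_1}{(\kappa_\gamma + \epsilon - \mu_1)^2} \le \frac{\kappa_\gamma + \epsilon/2}{(\epsilon/2)^2}, \]
Lemma A.1 implies that, for $j \ne k$ and $0 < \delta < [2(\kappa_\gamma + \epsilon/2)/\epsilon^2]\,[(N-M)/n]$,
\[ P\left( \left| t_j^T \mathcal{M} ((\kappa_\gamma + \epsilon) I - \mathcal{M})^{-2} t_k \right| > \delta,\ \bar J_{1,\epsilon} \right) \le 2\exp\left(-n\, \frac{n}{N-M} \cdot \frac{\delta^2 (\epsilon/2)^4}{3(\kappa_\gamma + \epsilon/2)^2}\right). \tag{41} \]
Denote the RHS of (41) by $\varepsilon_2(n)$. Notice that $|\lambda_{\min}(\bar E) - \lambda_{\min}(D_{\bar E})| \le \|\bar E - D_{\bar E}\|_{HS}$, where $\|\cdot\|_{HS}$ denotes the Hilbert-Schmidt norm and $D_{\bar E}$ is the diagonal of $\bar E$. Hence, from (40) and (41), with $\delta_M = \delta(1 + \sqrt{M(M-1)})$,
\[ P\left( \lambda_{\min}(\bar E) \le \gamma \int \frac{x}{(\kappa_\gamma + \epsilon - x)^2}\, dF_\gamma(x) - \delta_M,\ \bar J_{1,\epsilon} \right) \le M \varepsilon_1(n) + \frac{M(M-1)}{2}\, \varepsilon_2(n) \tag{42} \]
for $0 < \delta < [2(\kappa_\gamma + \epsilon/2)/\epsilon^2]\,[(N-M)/n]$, and for all $n \ge n_*(\delta, \epsilon, \gamma)$. If $0 < \epsilon < 2\gamma$, then it can be checked that $\int x/(\kappa_\gamma + \epsilon - x)^2\, dF_\gamma(x) > (1/(16\sqrt{\gamma}))(1/\sqrt{\epsilon})$. Therefore, set $\delta = (1 + \sqrt{M(M-1)})^{-1}\epsilon$ and choose $\epsilon$ small enough so that $(\sqrt{\gamma}/16)(1/\sqrt{\epsilon}) - \epsilon > 0$. Call the last quantity $C_\epsilon$, and observe that $C_\epsilon \to \infty$ as $\epsilon \to 0$. By (42), the result follows.

5.3. Proof of Theorem 5

Use the fact (due to Theorem 3) that $\hat\ell_\nu - \rho_\nu = O_P(n^{-1/2})$ and equations (32), (34) and (37) to get
\[ \sqrt{n}\,(a_\nu - e_\nu) = -\sqrt{n}\, R_\nu D_\nu e_\nu + o_P(1). \tag{43} \]
Since $\hat\ell_\nu - \rho_\nu = O_P(n^{-1/2})$, and since the off-diagonal elements of the matrix $V^{(2,\nu)}$ are $O_P(n^{-1/2})$, from (36),
\[ \sqrt{n}\,(a_\nu - e_\nu) = -\sum_{k \ne \nu}^{M} \frac{\ell_\nu}{\rho_\nu} \cdot \frac{\sqrt{\ell_k \ell_\nu}}{\ell_k - \ell_\nu} \left[\sqrt{n}\, \frac{s_{k\nu}}{\sqrt{\ell_k \ell_\nu}} + \sqrt{n}\, t_k^T \mathcal{M} (\rho_\nu I - \mathcal{M})^{-1} t_\nu\right] e_k + o_P(1). \tag{44} \]
Now, to complete the proof use the same trick as in the proof of Theorem 3. Using the same notation write $s_{k\nu} = \sqrt{\ell_k \ell_\nu}\,[t_k^T t_\nu + w_k^T w_\nu]$, where for $1 \le k \le M$, $w_k \overset{i.i.d.}{\sim} N(0, (1/n) I_{n-N+M})$, $t_k \overset{i.i.d.}{\sim} N(0, (1/n) I_{N-M})$, and $\{w_k\}$ and $\{t_k\}$ are mutually independent, and independent of $Z_B$. The asymptotic normality of $\sqrt{n}\,(a_\nu - e_\nu)$ can then be established by proving the asymptotic normality of $W_{3,n} := \sum_{k \ne \nu}^{M} (\ell_\nu/\rho_\nu)[\sqrt{\ell_k \ell_\nu}/(\ell_k - \ell_\nu)]\, \sqrt{n}\, w_k^T w_\nu\, e_k$ and $W_{4,n} := \sum_{k \ne \nu}^{M} (\ell_\nu/\rho_\nu)[\sqrt{\ell_k \ell_\nu}/(\ell_k - \ell_\nu)]\, \sqrt{n}\, t_k^T (I + \mathcal{M}(\rho_\nu I - \mathcal{M})^{-1})\, t_\nu\, e_k$ separately. It can be shown that,
\[ W_{3,n} \Longrightarrow N_M\left(0,\ (1-\gamma)\frac{\ell_\nu^2}{\rho_\nu^2} \sum_{k \ne \nu}^{M} \frac{\ell_k \ell_\nu}{(\ell_k - \ell_\nu)^2}\, e_k e_k^T\right). \]
To prove the asymptotic normality of $W_{4,n}$, fix an arbitrary vector $b \in \mathbb{R}^M$, and consider the characteristic function of $b^T W_{4,n}$, conditional on $\mathcal{M}$. As in the proof of Theorem 3, it is enough that, for some pre-specified $\delta > 0$,
\[ E\left[ \left| E\left(e^{i b^T W_{4,n}} \mid \mathcal{M}\right) - \exp\left(-\frac{1}{2} b^T \tilde\Sigma_\nu(\ell_\nu)\, b\right) \right| \mathbf{1}\{\mu_1 \le \kappa_\gamma + \delta\} \right] \to 0 \quad \text{as } n \to \infty, \tag{45} \]
where
\[ \tilde\Sigma_\nu(\ell_\nu) = \Sigma_\nu(\ell_\nu) - (1-\gamma)\frac{\ell_\nu^2}{\rho_\nu^2} \sum_{1 \le k \ne \nu \le M} \frac{\ell_k \ell_\nu}{(\ell_k - \ell_\nu)^2}\, e_k e_k^T = \gamma \ell_\nu^2 \int \frac{1}{(\rho_\nu - x)^2}\, dF_\gamma(x) \sum_{1 \le k \ne \nu \le M} \frac{\ell_k \ell_\nu}{(\ell_k - \ell_\nu)^2}\, e_k e_k^T. \]
Define $J_\gamma(\delta) := \{\mu_1 \le \kappa_\gamma + \delta\}$ where $\delta$ is small enough so that $\rho_\nu > \kappa_\gamma + 2\delta$. Define
\[ C_\nu = \ell_\nu^2 \sum_{1 \le k \ne \nu \le M} \frac{\ell_k \ell_\nu}{(\ell_k - \ell_\nu)^2}\, e_k e_k^T. \]
Choose $n$ large enough so that $b^T C_\nu b < n(\rho_\nu - \kappa_\gamma - \delta)^2$. Then observe that on $J_\gamma(\delta)$,
\[ E\left(e^{i b^T W_{4,n}} \mid \mathcal{M}\right) = E\left[E\left(e^{i b^T W_{4,n}} \mid t_\nu, \mathcal{M}\right) \mid \mathcal{M}\right] = E\left[\exp\left(-\frac{1}{2} b^T C_\nu b\; t_\nu^T (\rho_\nu I - \mathcal{M})^{-2} t_\nu\right) \Big|\, \mathcal{M}\right] = \prod_{j=1}^{N-M} \left(1 + \frac{b^T C_\nu b}{n(\rho_\nu - \mu_j)^2}\right)^{-1/2}, \tag{46} \]
where the first and the last steps owe to the fact that the $t_k \sim N(0, I_{N-M}/n)$ and are independent for $k = 1, \ldots, M$. The rest of the proof imitates the arguments used in the proof of Theorem 3 and is omitted.

5.4. Proof of Theorem 6

An invariance approach is taken to prove the result. Use the notation of Section 3. Write $p_{B,\nu}$ as $p_{B,\nu} = p_{B,\nu}(\Lambda, Z_A, Z_B)$. That is, treat $p_{B,\nu}$ as a map from $\mathbb{R}^M \times \mathbb{R}^{M \times n} \times \mathbb{R}^{(N-M) \times n}$ into $\mathbb{R}^{N-M}$ that maps $(\Lambda, Z_A, Z_B)$ to the vector which is the $B$-subvector of the $\nu$th eigenvector of $S$. Also, recall that $R_\nu = \|p_{B,\nu}\|$. Denote the class of $(N-M) \times (N-M)$ orthogonal matrices by $O_{N-M}$. From the decomposition of $S$ in Section 3, it follows that, for $G \in O_{N-M}$ arbitrary, $G p_{B,\nu} = p_{B,\nu}(\Lambda, Z_A, G Z_B)$, i.e., for every fixed $Z_A = z_A$, as a function of $Z_B$,
\[ G\, p_{B,\nu}(\Lambda, z_A, Z_B) = p_{B,\nu}(\Lambda, z_A, G Z_B). \tag{47} \]
Define, for $r > 0$, $A_r := A_r(z_A) = \{z_B \in \mathbb{R}^{(N-M) \times n} : \|p_{B,\nu}(\Lambda, z_A, z_B)\| = r\}$. Then (47) implies that
\[ G A_r = A_r \quad \text{for all } G \in O_{N-M}, \text{ for all } r > 0. \tag{48} \]
Thus, for any $0 < r_1 < r_2 < 1$, the set
\[ \{\|p_{B,\nu}(\Lambda, z_A, z_B)\| \in [r_1, r_2)\} = \bigcup_{r_1 \le r < r_2} A_r(z_A) \]
is invariant under rotation on the left. Note that $\bigcup_{r_1 \le r < r_2} A_r(z_A)$ is the $z_A$-section of the set $\{R_\nu \in [r_1, r_2)\}$ and hence is measurable (w.r.t. the $\sigma$-algebra generated by $Z_B$). From this it follows that for every fixed $Z_A = z_A$, and every Borel measurable subset $H$ of the unit sphere $S^{N-M-1}$, for all $G \in O_{N-M}$ it is meaningful to write
\[ P\left(R_\nu \in [r_1, r_2),\ G\,\frac{p_{B,\nu}(\Lambda, z_A, Z_B)}{R_\nu} \in H \,\Big|\, Z_A = z_A\right) \]
\[ = P\left(Z_B \in \bigcup_{r_1 \le r < r_2} A_r(z_A),\ G\,\frac{p_{B,\nu}(\Lambda, z_A, Z_B)}{\|p_{B,\nu}(\Lambda, z_A, Z_B)\|} \in H \,\Big|\, Z_A = z_A\right) \]
\[ = P\left(G Z_B \in \bigcup_{r_1 \le r < r_2} A_r(z_A),\ \frac{p_{B,\nu}(\Lambda, z_A, G Z_B)}{\|p_{B,\nu}(\Lambda, z_A, G Z_B)\|} \in H \,\Big|\, Z_A = z_A\right) \quad \text{(by (47) and (48))} \]
\[ = P\left(Z_B \in \bigcup_{r_1 \le r < r_2} A_r(z_A),\ \frac{p_{B,\nu}(\Lambda, z_A, Z_B)}{\|p_{B,\nu}(\Lambda, z_A, Z_B)\|} \in H \,\Big|\, Z_A = z_A\right) \quad \text{(since } G Z_B \overset{\mathcal{L}}{=} Z_B, \text{ independent of } Z_A\text{)} \]
\[ = P\left(R_\nu \in [r_1, r_2),\ \frac{p_{B,\nu}(\Lambda, z_A, Z_B)}{R_\nu} \in H \,\Big|\, Z_A = z_A\right). \]
From the equality of the first and the last terms, by standard arguments we conclude that for all Borel subsets $H$ of $S^{N-M-1}$,
\[ P\left(G\,\frac{p_{B,\nu}}{R_\nu} \in H \,\Big|\, R_\nu, Z_A\right) = P\left(\frac{p_{B,\nu}}{R_\nu} \in H \,\Big|\, R_\nu, Z_A\right) \]
for all $G \in O_{N-M}$. This rotational invariance means that given $R_\nu$ and $Z_A$, the conditional distribution of $q_\nu = p_{B,\nu}/R_\nu$ is uniform on $S^{N-M-1}$. Moreover, since the conditional distribution of $q_\nu$ does not depend on $(R_\nu, Z_A)$, $q_\nu$ and $(R_\nu, Z_A)$ are independent. In particular, the marginal distribution of $q_\nu$ is uniform on $S^{N-M-1}$.

Acknowledgement

The author is supported in parts by the NIH grant #R01EB1988-09. The author thanks Professor Iain Johnstone for some thoughtful discussions, and also for pointing to two important references; Professor Richard Olshen for some constructive suggestions regarding the presentation of the material; and Dr. Jinho Baik for some useful communications. He also thanks two anonymous referees for their suggestions that improved the presentation.

Appendix A

A.1. Weak concentration inequalities for random quadratic forms

The following two lemmas are referred to as weak concentration inequalities. Suppose that $C : \mathcal{X} \to \mathbb{R}^{n \times n}$ is a measurable function. Let $Z$ be a random variable taking values in $\mathcal{X}$. Let $\|C\|$ denote the operator norm of $C$, i.e., the largest singular value of $C$.

Lemma A.1. Suppose that $X$ and $Y$ are i.i.d. $N_n(0, I)$ independent of $Z$. Then for every $L > 0$ and $0 < \delta < 1$, for all $0 < t < \frac{\delta}{1-\delta} L$,
\[ P\left(\frac{1}{n}\left|X^T C(Z)\, Y\right| > t,\ \|C(Z)\| \le L\right) \le 2\exp\left(-\frac{(1-\delta)\, n t^2}{2L^2}\right). \tag{49} \]

Lemma A.2. Suppose that $X$ is distributed as $N_n(0, I)$ independent of $Z$. Also let $C(z) = C^T(z)$ for all $z \in \mathcal{X}$. Then for every $L > 0$ and $0 < \delta < 1$, for all $0 < t < \frac{2\delta}{1-\delta} L$,
\[ P\left(\frac{1}{n}\left|X^T C(Z)\, X - \mathrm{trace}(C(Z))\right| > t,\ \|C(Z)\| \le L\right) \le 2\exp\left(-\frac{(1-\delta)\, n t^2}{4L^2}\right). \tag{50} \]

The proofs involve standard arguments and are omitted.

A.2. Proof of Proposition 1

In order to prove this result the following result, a part of Theorem 2.13 of Davidson and Szarek (2001), is used.

Lemma A.3. Let $Z$ be a $p \times q$ matrix of i.i.d. $N(0, 1/q)$ entries with $p \le q$. Let $s_{\max}(Z)$ and $s_{\min}(Z)$ denote the largest and the smallest singular value of $Z$, respectively. Then
\[ P\left(s_{\max}(Z) > 1 + \sqrt{p/q} + t\right) \le e^{-qt^2/2}, \tag{51} \]
\[ P\left(s_{\min}(Z) < 1 - \sqrt{p/q} - t\right) \le e^{-qt^2/2}. \tag{52} \]

Take $p = N-M$, $q = n$ and $Z = Z_B/\sqrt{n}$. Note that $\sqrt{\mu_1} = s_{\max}(Z_B/\sqrt{n})$. Let $m_1 := (1 + \sqrt{(N-M)/n})^2$. For any $r > 0$, if $\sqrt{\mu_1} - \sqrt{m_1} \le r$, then $\mu_1 - m_1 \le (\sqrt{\mu_1} + \sqrt{m_1}) \max\{\sqrt{\mu_1} - \sqrt{m_1}, 0\} \le (2\sqrt{m_1} + r)\, r$. This implies, by (51), that
\[ e^{-nr^2/2} \ge P\left(s_{\max}\left(\frac{1}{\sqrt{n}} Z_B\right) > \sqrt{m_1} + r\right) \ge P\left(\mu_1 - m_1 > r(2\sqrt{m_1} + r)\right). \]
Set $t = r(2\sqrt{m_1} + r) = (r + \sqrt{m_1})^2 - m_1$. Solving for $r$ one gets that, for $t > 0$, $r = \sqrt{t + m_1} - \sqrt{m_1}$. Substitute in the last display to get, for $t > 0$,
\[ P(\mu_1 - m_1 > t) \le \exp\left[-\frac{n}{2}\left(\sqrt{t + m_1} - \sqrt{m_1}\right)^2\right]. \tag{53} \]
Now, to complete the proof, observe that
\[ \sqrt{t + m_1} - \sqrt{m_1} = \frac{t}{\sqrt{t + m_1} + \sqrt{m_1}} > \frac{t}{2\sqrt{t + m_1}} \ge \frac{t}{2\sqrt{t + \kappa_\gamma + \delta/4}} \]
for $n \ge n_0(\gamma, \delta)$. Since $\delta < \kappa_\gamma/2$ implies that $\kappa_\gamma + \delta < 3\kappa_\gamma/2$, for $n \ge n_0(\gamma, \delta)$, one has $\sqrt{3\delta/4 + m_1} - \sqrt{m_1} > \sqrt{6}\,\delta/(8\sqrt{\kappa_\gamma})$. The result follows if $t = 3\delta/4$ is substituted in (53), since $|m_1 - \kappa_\gamma| \le \delta/4$.

A.3. A concentration inequality for Lipschitz functionals

We restate Corollary 1.8 (b) of Guionnet and Zeitouni (2000) in our context.

Lemma A.4. Suppose that $Y$ is an $m \times n$ matrix, $m \le n$, with independent (real or complex) entries $Y_{kl}$ with law $P_{kl}$, $1 \le k \le m$, $1 \le l \le n$. Let $S_\Delta = Y \Delta Y^*$ be a generalized Wishart matrix, where $\Delta$ is a diagonal matrix with real, nonnegative diagonal entries and spectral radius $\phi_\Delta > 0$. Suppose that the family $\{P_{kl} : 1 \le k \le m,\ 1 \le l \le n\}$ satisfies the logarithmic Sobolev inequality with uniformly bounded constant $c$. Then for any function $f$ such that $g(x) := f(x^2)$ is Lipschitz, for any $\delta > 0$,
\[ P\left(\left|\frac{1}{m}\mathrm{trace}\, f\left(\frac{1}{m+n} S_\Delta\right) - E\left(\frac{1}{m}\mathrm{trace}\, f\left(\frac{1}{m+n} S_\Delta\right)\right)\right| > \delta\right) \le 2\exp\left(-\frac{m^2\delta^2}{2c\,\phi_\Delta\, |g|_L^2}\right), \tag{54} \]
where $|g|_L$ is the Lipschitz norm of $g$.

Define
\[ G_k(x; \rho, \gamma, \delta) = \begin{cases} \dfrac{1}{(\rho - x)^k} & x \le \kappa_\gamma + \delta, \\[1.5ex] \dfrac{1}{(\rho - \kappa_\gamma - \delta)^k} & x > \kappa_\gamma + \delta, \end{cases} \quad \text{where } \rho > \kappa_\gamma + \delta,\ k = 1, 2, \ldots. \tag{55} \]
Then define $f_k(x) = G_k(x; \rho, \gamma, \delta)$, $k = 1, 2$, $g_k(x) = f_k(x^2)$, and notice that $g_k(x)$ is Lipschitz with $|g_k|_L = [2k(\kappa_\gamma + \delta)^{1/2}]/[(\rho - \kappa_\gamma - \delta)^{k+1}]$. Take $m = N-M$, $Y = Z_B$ and $\Delta = [(m+n)/n]\, I_n$, and recall that $N(0,1)$ satisfies the logarithmic Sobolev inequality with constant $c = 1$ (Bogachev (1998, Thm. 1.6.1)). Further, $\phi_\Delta = (m+n)/n$ and $S_\Delta = (m+n) S_{BB}$. Therefore Lemma A.4 implies the following.

Proposition A.1. For $k = 1, 2$, and any $\epsilon > 0$,
\[ P\left(\left|\frac{1}{n}\mathrm{trace}\, G_k(S_{BB}; \rho, \gamma, \delta) - E\left(\frac{1}{n}\mathrm{trace}\, G_k(S_{BB}; \rho, \gamma, \delta)\right)\right| > \epsilon\right) \le 2\exp\left(-\frac{n}{n+N-M} \cdot \frac{n^2\epsilon^2}{2} \cdot \frac{(\rho - \kappa_\gamma - \delta)^{2(k+1)}}{4k^2(\kappa_\gamma + \delta)}\right). \tag{56} \]

Appendix B

B.1. Expression for $\rho_\nu$

Lemma B.1. For $\ell_\nu \ge 1+\sqrt{\gamma}$, $\rho_\nu = \ell_\nu(1 + [\gamma/(\ell_\nu - 1)])$ solves (3).

Proof. First step is to show that for any $\ell > 1+\sqrt{\gamma}$,
\[ \int \frac{x}{\rho(\ell) - x}\, f_\gamma(x)\, dx = \frac{1}{\ell - 1}, \tag{57} \]
where $\rho(\ell) = \ell(1 + [\gamma/(\ell-1)])$, and $f_\gamma(x)$ is the density of the Marčenko-Pastur law with parameter $\gamma$ ($\le 1$). This is obtained by direct computation. The case $\ell = 1+\sqrt{\gamma}$ follows from this and the Monotone Convergence Theorem.

The following result is easily obtained from Lemma B.1.

Lemma B.2. For $\ell > 1+\sqrt{\gamma}$,
\[ \int \frac{x}{(\rho(\ell) - x)^2}\, dF_\gamma(x) = \frac{1}{(\ell-1)^2 - \gamma}. \tag{58} \]

B.2. Proof of Proposition 2

First, consider the expansion
\[ t_j^T \mathcal{M} (\rho I - \mathcal{M})^{-2} t_j - \gamma \int \frac{x}{(\rho - x)^2}\, dF_\gamma(x) = \left[t_j^T \mathcal{M} (\rho I - \mathcal{M})^{-2} t_j - \frac{1}{n}\mathrm{trace}\left(\mathcal{M}(\rho I - \mathcal{M})^{-2}\right)\right] + \left[\frac{1}{n}\mathrm{trace}\left(\mathcal{M}(\rho I - \mathcal{M})^{-2}\right) - \gamma \int \frac{x}{(\rho - x)^2}\, dF_\gamma(x)\right]. \]
Split the second term further as
\[ \rho\left[\frac{1}{n}\mathrm{trace}\left((\rho I - \mathcal{M})^{-2}\right) - \gamma \int \frac{1}{(\rho - x)^2}\, dF_\gamma(x)\right] - \left[\frac{1}{n}\mathrm{trace}\left((\rho I - \mathcal{M})^{-1}\right) - \gamma \int \frac{1}{\rho - x}\, dF_\gamma(x)\right] = (I) - (II), \quad \text{say}. \]
Define $J_\gamma(\delta/2) := \{\mu_1 \le \kappa_\gamma + \delta/2\}$. In order to bound the first term, notice that $\sqrt{n}\, t_j \sim N(0, I_{N-M})$ for $j = 1, \ldots, M$, and on $J_\gamma(\delta/2)$,
\[ \left\|\mathcal{M}(\rho I - \mathcal{M})^{-2}\right\| = \frac{\mu_1}{(\rho - \mu_1)^2} \le \frac{\kappa_\gamma + \delta/2}{(\rho - \kappa_\gamma - \delta/2)^2}. \]
Thus, apply Lemma A.2 to conclude that (setting $\delta = 1/3$ in the lemma),
\[ P\left(\left|t_j^T \mathcal{M} (\rho I - \mathcal{M})^{-2} t_j - \frac{1}{n}\mathrm{trace}\left(\mathcal{M}(\rho I - \mathcal{M})^{-2}\right)\right| \ge \frac{\epsilon}{4},\ J_\gamma\left(\frac{\delta}{2}\right)\right) \le 2\exp\left(-\frac{n}{N-M} \cdot \frac{(\rho - \kappa_\gamma - \delta/2)^4\, n(\epsilon/4)^2}{6(\kappa_\gamma + \delta/2)^2}\right) \]
\[ \text{for } 0 < \epsilon < \frac{4(\kappa_\gamma + \delta/2)}{(\rho - \kappa_\gamma - \delta/2)^2} \cdot \frac{N-M}{n}. \tag{59} \]
To provide bounds for (I), observe that on $J_\gamma(\delta/2)$, $\mathrm{trace}((\rho I - \mathcal{M})^{-2}) = \mathrm{trace}\, G_2(S_{BB}; \rho, \gamma, \delta/2)$, where $G_2(\cdot\,; \cdot, \cdot, \cdot)$ is defined through (55). Therefore, Proposition A.1 implies that
\[ P\left(\left|\frac{1}{n}\mathrm{trace}\left((\rho I - \mathcal{M})^{-2}\right) - E\left(\frac{1}{n}\mathrm{trace}\, G_2(S_{BB}; \rho, \gamma, \delta/2)\right)\right| > \frac{\epsilon}{4\rho},\ J_\gamma\left(\frac{\delta}{2}\right)\right) \le 2\exp\left(-\frac{n}{n+N-M} \cdot \frac{n^2(\epsilon/4)^2}{2} \cdot \frac{(\rho - \kappa_\gamma - \delta/2)^6}{16\rho^2(\kappa_\gamma + \delta/2)}\right). \tag{60} \]
Analogously, use $G_1(\cdot\,; \cdot, \cdot, \cdot)$ instead of $G_2(\cdot\,; \cdot, \cdot, \cdot)$ in the analysis of (II), to obtain
\[ P\left(\left|\frac{1}{n}\mathrm{trace}\left((\rho I - \mathcal{M})^{-1}\right) - E\left(\frac{1}{n}\mathrm{trace}\, G_1(S_{BB}; \rho, \gamma, \delta/2)\right)\right| > \frac{\epsilon}{4},\ J_\gamma\left(\frac{\delta}{2}\right)\right) \le 2\exp\left(-\frac{n}{n+N-M} \cdot \frac{n^2(\epsilon/4)^2}{2} \cdot \frac{(\rho - \kappa_\gamma - \delta/2)^4}{4(\kappa_\gamma + \delta/2)}\right). \tag{61} \]
Take $\bar F_{n,N-M}$ to be the expected ESD of $S_{BB}$. Now, to tackle the remainders in (I) and (II) notice that, since $G_1$ and $G_2$ are continuous and bounded in their first argument, for $k = 1, 2$,
\[ E\,\frac{1}{n}\mathrm{trace}\, G_k\left(S_{BB}; \rho, \gamma, \frac{\delta}{2}\right) = \frac{N-M}{n} \int G_k\left(x; \rho, \gamma, \frac{\delta}{2}\right) d\bar F_{n,N-M}(x). \]
Bai (1993) proved under fairly weak conditions (Bai (1993, Thm. 3.2)), that if $\theta_1 \le p/n \le \theta_2$ where $0 < \theta_1 < 1 < \theta_2 < \infty$, then $\|\bar F_{n,p} - F_{p/n}\|_\infty \le C_1(\theta_1, \theta_2)\, n^{-5…}$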
48;(62)whereFp=ndenotestheMarcenko-Pasturlawwithparameterp=n.Herekk1meansthesup-normandC1isaconstantwhichdependson1;2.SinceF(NM)=n=)F\r,asn!1,andG1(;;\r;=2)isboundedandcon-tinuous,andRG1(x;;\r;=2)dF\r(x)=R(x)1dF\r(x)from(62),itfollowsthatthereisn1(;;;\r)1suchthat,fornn1(;;;\r), E1 ntraceG1(SBB;;\r; 2)\rZ1 xdF\r(x)  8:(63)Similarly,thereisn2(;;;\r)n1(;;;\r)suchthatfornn2(;;;\r), E1 ntraceG2(SBB;;\r; 2)\rZ1 (x)2dF\r(x)  8:(64)Combine(59),(60),(61),(64)and(63),andtheresultfollows.ReferencesAnderson,T.W.(1963).Asymptotictheoryofprincipalcomponentanalysis.Ann.Math.Statist.34,122-148.Bai,Z.D.(1993).Convergencerateofexpectedspectraldistributionsoflargerandommatrices.PartII.Samplecovariancematrices.Ann.Probab.21,649-672.Bai,Z.D.(1999).Methodologiesinspectralanalysisoflargedimensionalrandommatrices,areview.Statist.Sinica9,611-677.Bai,Z.D.andSilverstein,J.W.(2004).CLTforlinearspectralstatisticsoflargedimensionalsamplecovariancematrix.Ann.Probab.32,553-605.Bai,Z.D.andYin,Y.Q.(1993).Limitofthesmallesteigenvalueofalargedimensionalsamplecovariancematrix.Ann.Probab.21,1275-1294.Baik,J.,BenArous,G.andPeche(2005).Phasetransitionofthelargesteigenvaluefornon-nullcomplexcovariancematrices.Ann.Probab.33,1643-1697.Baik,J.andSilverstein,J.W.(2006).Eigenvaluesoflargesamplecovariancematricesofspikedpopulationmodels.J.MultivariateAnal.97,1382-1408.(AlsoarXiv:math.ST/048165v1).Bogachev,V.I.(1998).GaussianMeasures.AmericanMathematicalSociety.Buja,A.,Hastie,T.andTibshirani,R.(1995).Penalizeddiscriminantanalysis.Ann.Statist.23,73-102.Davidson,K.R.andSzarek,S.(2001).Localoperatortheory,randommatricesandBanachspaces.In\HandbookontheGeometryofBanachspaces",(V.1,Johnson,W.B.,Lenden-strauss,J.eds.),317-366,ElsevierScience.Guionnet,A.andZeitouni,O.(2000).Concentrationofthespectralmeasureforlargematrices.ElectronicCommunicationsinProbability5,119-136. 

Hoyle, D. and Rattray, M. (2003). Limiting form of the sample covariance eigenspectrum in PCA and kernel PCA. Advances in Neural Information Processing Systems (NIPS 16).

Hoyle, D. and Rattray, M. (2004). Principal component analysis eigenvalue spectra from data with symmetry breaking structure. Phys. Rev. E 69, 026124.

Johnstone, I. M. (2001). On the distribution of the largest principal component. Ann. Statist. 29, 295-327.

Johnstone, I. M. and Lu, A. Y. (2004). Sparse principal component analysis. (To appear in J. Amer. Statist. Assoc.).

Kneip, A. and Utikal, K. J. (2001). Inference for density families using functional principal component analysis. J. Amer. Statist. Assoc. 96, 519-542.

Kato, T. (1980). Perturbation Theory of Linear Operators. Springer-Verlag, New York.

Muirhead, R. J. (1982). Aspects of Multivariate Statistical Theory. John Wiley, New York.

Paul, D. (2004). Asymptotics of the leading sample eigenvalues for a spiked covariance model. Technical report, Stanford University. http://anson.ucdavis.edu/debashis/techrep/eigenlimit.pdf

Péché, S. (2003). Universality of local eigenvalue statistics for random sample covariance matrices. Ph.D. Thesis, Ecole Polytechnique Fédérale de Lausanne.

Ramsay, J. O. and Silverman, B. W. (1997). Functional Data Analysis. Springer-Verlag, New York.

Rao, C. R. (1973). Linear Statistical Inference and its Applications. Wiley, New York.

Roy, S. N. (1953). On a heuristic method of test construction and its use in multivariate analysis. Ann. Math. Statist. 24, 220-238.

Silverstein, J. W. (1990). Weak convergence of random functions defined by the eigenvectors of sample covariance matrices. Ann. Probab. 18, 1174-1194.

Silverstein, J. W. and Choi, S. (1995). Analysis of the limiting spectral distribution of large dimensional random matrices. J. Multivariate Anal. 54, 295-309.

Soshnikov, A. (2002). A note on universality of the distribution of the largest eigenvalues in certain sample covariance matrices. J. Statist. Phys. 108, 1033-1056. (Also arXiv:math.PR/0104113v2).

Telatar, E. (1999). Capacity of multi-antenna Gaussian channels. European Transactions on Telecommunications 10, 585-595.

Tyler, D. E. (1983). Asymptotic distribution of the sample roots for a nonnormal population. Biometrika 63, 639-645.

Department of Statistics, University of California, Davis, One Shields Avenue, Davis, CA 95616.

E-mail: debashis@wald.ucdavis.edu

(Received June 2005; accepted February 2006)
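
Numerical illustration. The closed forms (57) and (58) of Lemmas B.1 and B.2 are stated as following from direct computation. The sketch below checks them numerically for one illustrative parameter choice. It is not part of the proofs; the function names `mp_integral` and `rho`, the values $\gamma=0.5$ and $\ell=3$, and the midpoint quadrature are all assumptions made for the illustration. It takes the Marčenko-Pastur density for $0<\gamma<1$ to be $f_\gamma(x)=\sqrt{(b-x)(x-a)}/(2\pi\gamma x)$ on $[a,b]$ with $a=(1-\sqrt\gamma)^2$, $b=(1+\sqrt\gamma)^2$, and uses the substitution $x=1+\gamma+2\sqrt\gamma\cos\theta$.

```python
import math


def mp_integral(g, gamma, m=2000):
    """Midpoint-rule approximation of the integral of g against F_gamma (0 < gamma < 1).

    With x = (1 + gamma) + 2*sqrt(gamma)*cos(theta), the density
    sqrt((b - x)(x - a)) / (2*pi*gamma*x) dx becomes 2*sin(theta)**2 / (pi*x) dtheta
    on [0, pi]. (Hypothetical helper, not from the paper.)
    """
    center = 1.0 + gamma
    radius = 2.0 * math.sqrt(gamma)
    total = 0.0
    for j in range(m):
        theta = (j + 0.5) * math.pi / m          # midpoint of the j-th subinterval
        x = center + radius * math.cos(theta)    # point in the support [a, b]
        total += g(x) * math.sin(theta) ** 2 / x
    return 2.0 * total / m                       # (pi/m) * (2/pi) * sum


def rho(ell, gamma):
    """rho(ell) = ell * (1 + gamma / (ell - 1)), the expression in Lemma B.1."""
    return ell * (1.0 + gamma / (ell - 1.0))


gamma, ell = 0.5, 3.0                            # illustrative values, ell > 1 + sqrt(gamma)
r = rho(ell, gamma)

mass = mp_integral(lambda x: 1.0, gamma)                    # total mass of F_gamma
lhs_57 = mp_integral(lambda x: x / (r - x), gamma)          # left side of (57)
lhs_58 = mp_integral(lambda x: x / (r - x) ** 2, gamma)     # left side of (58)
rhs_57 = 1.0 / (ell - 1.0)
rhs_58 = 1.0 / ((ell - 1.0) ** 2 - gamma)

print(mass, lhs_57, rhs_57, lhs_58, rhs_58)
```

The trigonometric substitution removes the square-root behaviour of the density at the edges of the support, so a plain midpoint rule is adequate here; any other quadrature scheme could be used instead.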