edu Abstract We consider the sparse Fourier transform problem given a complex vector of length and a parameter estimate the largest in magnitude coe64259cients of the Fourier transform of The problem is of key interest in several areas including s ID: 24425
Download Pdf The PPT/PDF document "Simple and Practical Algorithm for Spars..." is the property of its rightful owner. Permission is granted to download and print the materials on this web site for personal, non-commercial use only, and to display it on your personal computer provided you do not modify the materials and that you retain all copyright notices contained in the materials. By downloading content from our website, you accept the terms of this agreement.
SimpleandPracticalAlgorithmforSparseFourierTransformHaithamHassaniehMITPiotrIndykMITDinaKatabiMITEricPriceMITfhaithamh,indyk,dk,ecpriceg@mit.eduAbstractWeconsiderthesparseFouriertransformproblem:givenacomplexvectorxoflengthn,andaparameterk,estimatetheklargest(inmagnitude)coecientsoftheFouriertransformofx.Theproblemisofkeyinterestinseveralareas,includingsignalprocessing,audio/image/videocompression,andlearningtheory.Weproposeanewalgorithmforthisproblem.Thealgorithmleveragestechniquesfromdigitalsignalpro-cessing,notablyGaussianandDolph-Chebyshevlters.Unlikethetypicalapproachtothisproblem,ouralgo-rithmisnotiterative.Thatis,insteadofestimating\large"coecients,subtractingthemandrecursingonthereminder,itidentiesandestimatestheklargestcoecientsin\oneshot",inamannerakintosketch-ing/streamingalgorithms.Theresultingalgorithmisstructurallysimplerthanitspredecessors.Asaconse-quence,weareabletoextendconsiderablytherangeofsparsity,k,forwhichthealgorithmisfasterthanFFT,bothintheoryandpractice.1IntroductionTheFastFourierTransform(FFT)isoneofthemostfundamentalnumericalalgorithms.ItcomputestheDiscreteFourierTransform(DFT)ofann-dimensionalsignalinO(nlogn)time.Thealgorithmplaysacentralroleinseveralapplicationareas,includingsignalpro-cessingandaudio/image/videocompression.Itisalsoafundamentalsubroutineinintegermultiplicationandencoding/decodingoferror-correctingcodes.AnyalgorithmforcomputingtheexactDFTmusttaketimeatleastproportionaltoitsoutputsize,whichis(n).Inmanyapplications,however,mostoftheFouriercoecientsofasignalaresmallorequaltozero,i.e.,theoutputoftheDFTis(approximately)sparse.Forexample,atypical8x8blockinavideoframehasonaverage7non-negligiblecoecients(i.e.,89%ofthecoecientsarenegligible)[CGX96].Imagesandaudiodataareequallysparse.Thissparsityprovidesthera-tionaleunderlyingcompressionschemessuchasMPEGandJPEG.OtherapplicationsofsparseFourieranalysisincludecomputationallearningtheory[KM91,LMN93],analysisofbooleanfunctions[KKL88,O'D08],multi-scaleanalysis[DRZ07],compressedsensing[Don06,CRT06],similaritysearchindatabases[AFS93],spec-trumsensingforwidebandchannels[LVS11],anddata-centermonitoring[MNL10].WhentheoutputoftheDFTissparseorapprox-imatelysparse,onecanhopeforan\output-sensitive"algorithm,whoseruntimedependsonk,thenumberofcomputed\large"coecients.Formally,givenacom-plexvectorxwhoseFouriertransformis^x,werequirethealgorithmtooutputanapproximation^x0to^xthatsatisesthefollowing`2=`2guarantee:1(1.1)k^x ^x0k2Cmink-sparseyk^x yk2;whereCissomeapproximationfactorandthemini-mizationisoverk-sparsesignals.Notethatthebestk-sparseapproximationcanbeobtainedbysettingall 1Thealgorithminthispaperhasasomewhatstrongerguar-antee;see\Results"formoredetails. butthelargestkcoecientsof^xto0.SuchavectorcanberepresentedusingonlyO(k)numbers.Thus,ifkissmall,theoutputofthealgorithmcanbeexpressedsuccinctly,andonecanhopeforanalgorithmwhoseruntimeissublinearinthesignalsizen.Therstsuchsublinearalgorithm(fortheHadamardtransform)waspresentedin[KM91].Shortlyafter,severalsublinearalgorithmsfortheFouriertrans-formoverthecomplexeldwerediscovered[Man92,GGI+02,AGS03,GMS05,Iwe10a,Aka10].2Thesealgorithmshavearuntimethatispolynomialinkandlogn.3Theexponentsofthepolynomials,how-ever,aretypicallylarge.ThefastestamongthesealgorithmshavearuntimeoftheformO(k2logcn)(asin[GGI+02,Iwe10a])ortheformO(klogcn))(asin[GMS05]),forsomeconstantc2.Inpractice,theexponentsintheruntimeofthesealgorithmsandtheircomplexstructureslimittheirap-plicabilitytoonlyverysparsesignals.Inparticular,themorerecentalgorithmswereimplementedandevaluatedempiricallyagainstFFTW,anecientimplementationoftheFFTwitharuntimeO(nlogn)[Iwe10a,IGS07].Theresultsshowthatthealgorithmin[GMS05]iscom-petitivewithFFTWforn=222andk135[IGS07].Thealgorithmsin[GGI+02,Iwe10a]requireanevensparsersignal(i.e.,largernandsmallerk)tobecom-petitivewithFFTW.Results.Inthispaper,weproposeanewsub-linearalgorithmforsparseFouriertransformoverthecomplexeld.Thekeyfeatureofouralgorithmisitssimplicity:thealgorithmhasasimplestructure,whichleadstoecientruntimewithlowbig-Ohconstant.Specically,forthetypicalcaseofnapowerof2,ouralgorithmhastheruntimeofO(logn nklogn):Thus,thealgorithmisfasterthanFFTWforkuptoO(n=logn).Incontrast,earlieralgorithmsrequiredasymptoticallysmallerboundsonk.Thisasymptoticimprovementisalsore\rectedinempiricalruntimes.Forexample,forn=222,ouralgorithmoutperformsFFTWforkuptoabout2200,whichisanorderofmagnitudehigherthanpreviouslyachieved.Theestimationsprovidedbyouralgorithmsatisfytheso-called`1=`2guarantee.Specically,letybetheminimizerofk^x yk2.Foraprecisionparameter=1=nO(1),andaconstant0,our(randomized) 2See[GST08]forastreamlinedexpositionofsomeofthealgorithms.3AssumingallinputsarerepresentedusingO(logn)bits.algorithmoutputs^x0suchthat:(1.2)k^x ^x0k21k^x yk22=k+kxk21withprobability1 1=n.Theadditivetermthatdependsonappearsinallpastalgorithms[Man92,GGI+02,AGS03,GMS05,Iwe10a,Aka10],althoughtypically(withtheexceptionof[Iwe10b])itiseliminatedbyassumingthatallcoordinatesareintegersintherangef nO(1):::nO(1)g.Inthispaper,wekeepthedependenceonexplicit.The`1=`2guaranteeofEquation(1.2)isstrongerthanthe`2=`2guaranteeofEquation1.1.Inparticular,the`1=`2guaranteewithaconstantapproximationfactorCimpliesthe`2=`2guaranteewithaconstantapproximationfactorC0,ifonesetsallbuttheklargestentriesin^x0to0.4Furthermore,insteadofboundingonlythecollectiveerror,the`1=`2guaranteeensuresthateveryFouriercoecientiswell-approximated.Techniques.Westartwithanoverviewofthetechniquesusedinpriorworks,thendescribeourcon-tributioninthatcontext.Atahighlevel,sparseFourieralgorithmsworkbybinningtheFouriercoecientsintoasmallnumberofbuckets.Sincethesignalissparseinthefrequencydomain,eachbucketislikely5tohaveonlyonelargecoecient,whichcanthenbelocated(tonditsposition)andestimated(tonditsvalue).Forthealgorithmtobesublinear,thebinninghastobedoneinsublineartime.Toachievethisgoal,thesealgo-rithmsbintheFouriercoecientusingann-dimensionalltervectorGthatisconcentratedbothintimeandfrequency,i.e.,Giszeroexceptatasmallnumberoftimecoordinates,anditsFouriertransform^Gisnegli-gibleexceptatasmallfraction(about1=k)ofthefre-quencycoordinates(the\pass"region).DependingonthechoiceofthelterG,pastalgorithmscanbeclassi-edas:iteration-basedorinterpolation-based.Iteration-basedalgorithmsusealterthathasasig-nicantmassoutsideitspassregion[Man92,GGI+02,GMS05].Forexample,thepapers[GGI+02,GMS05]setGtothebox-carfunction,inwhichcase^GistheDirichletkernel,whosetaildecaysinaninverselinearfashion.Sincethetaildecaysslowly,theFouriercoef-cientsbinnedtoaparticularbucket\leak"intootherbuckets.Ontheotherhand,thepaper[Man92]esti-matestheconvolutionintimedomainviarandomsam-pling,whichalsoleadstoalargeestimationerror.Tore-ducetheseerrorsandobtainthe`2=`2guarantee,thesealgorithmshavetoperformmultipleiterations,where 4Thisfactwasimplicitin[CM06].Foranexplicitstatementandproofsee[GI10],remarksafterTheorem2.5Onecanrandomizethepositionsofthefrequenciesbysam-plingthesignalintimedomainappropriately[GGI02,GMS05].Seesection3part(b)forthedescription. eachiterationestimatesthelargestFouriercoecient(theoneleastimpactedbyleakage)andsubtractsitscontributiontothetimesignal.Theiterativeupdateofthetimesignalcausesalargeincreaseinruntime.Thealgorithmsin[GGI+02,Man92]performthisupdatebygoingthroughO(k)iterationseachofwhichupdatesatleastO(k)timesamples,resultinginanO(k2)termintheruntime.Thealgorithm[GMS05],introduceda"bulksampling"algorithmthatamortizesthispro-cessbutitrequiressolvinginstancesofanon-uniformFouriertransform,whichisexpensiveinpractice.Interpolation-basedalgorithmsarelesscommonandlimitedtothedesignin[Iwe10a].Thisapproachusesaleakage-freelter,G,toavoidtheneedforitera-tion.Specically,thealgorithmin[Iwe10a]usesforGalterinwhichG=1iimodn=p=0andG=0oth-erwise.TheFouriertransformofthislterisa\spiketrain"withperiodp.Thislterdoesnotleak:itisequalto1on1=pfractionofcoordinatesandiszeroelsewhere.Unfortunately,however,suchalterrequiresthatpdi-videsn;moreover,thealgorithmneedsseveraldierentvaluesofp.Sinceingeneralonecannotassumethatnisdivisiblebyallnumbersp,thealgorithmtreatsthesignalasacontinuousfunctionandinterpolatesitattherequiredpoints.Interpolationintroducesadditionalcomplexityandincreasestheexponentsintheruntime.Ourapproach.Thekeyfeatureofouralgorithmistheuseofadierenttypeoflter.Inthesimplestcase,weusealterobtainedbyconvolvingaGaussianfunctionwithabox-carfunction.AmoreecientltercanbeobtainedbyreplacingtheGaussianfunctionwithaDolph-Chebyshevfunction.(SeeFig.1foranillustration.)Becauseofthisnewlter,ouralgorithmdoesnotneedtoeitheriterateorinterpolate.Specically,thefrequencyresponseofourlter^Gisnearly\ratinsidethepassregionandhasanexponentialtailoutsideit.Thismeansthatleakagefromfrequenciesinotherbucketsisnegligible,andhence,ouralgorithmneednotiterate.Also,lteringcanbeperformedusingtheexistinginputsamplesx,andhenceouralgorithmneednotinterpolatethesignalatnewpoints.Avoidingbothiterationandinterpolationisthekeyfeaturethatmakesouralgorithmecient.Further,oncealargecoecientisisolatedinabucket,oneneedstoidentifyitsfrequency.Incontrasttopastworkwhichtypicallyusesbinarysearchforthistask,weadoptanideafrom[PS10]andtailorittoourproblem.Specically,wesimplyselectthesetof\large"binswhicharelikelytocontainlargecoecients,anddirectlyestimateallfrequenciesinthosebins.Tobalancethecostofthebinselectionandestimationsteps,wemakethenumberofbinssomewhatlargerthanthetypicalvalueofO(k).Specically,weuseBp nk,whichleadstothestatedruntime.62NotationTransform-RelatedNotations.Foraninputsignalx2Cn,itsFourierspectrumisdenotedby^x.Wewillusexytodenotetheconvolutionofxandy,andxytodenotethecoordinate-wiseproductofxandy.Recallthatdxy=^x^y.Wedene!=e2i=ntobeaprimitiventhrootofunity(herei=p 1,butintherestofthepaper,iisanindex).Indices-RelatedNotations.Alloperationsonindicesinthispaperaretakenmodulon.Thereforewemightrefertoann-dimensionalvectorashavingcoordinatesf0;1;:::;n 1gorf0;1;:::;n=2; n=2+1;:::; 1g.interchangeably.Finally,[s]referstothesetofindicesf0;:::;s 1g,whereassupp(x)referstothesupportofvectorx,i.e.,thesetofnon-zerocoordinates.Inthispaperweassumethatnisanintegerpowerof2.3Basics(a)WindowFunctions.Indigitalsignalpro-cessing[OSB99]onedeneswindowfunctionsinthefollowingmanner:Definition3.1.Wedenea(;;w)standardwin-dowfunctiontobeasymmetricvectorF2Rnwithsupp(F)[ w=2;w=2]suchthat^F0=1,^F0foralli2[ n;n],andj^Fjforalli=2[ n;n].Claim3.2.Foranyand,thereexistsan(;;O(1 log(1=)))standardwindowfunction.Proof.Thisisawellknownfact[Smi11].Forex-ample,foranyand,onecanobtainastan-dardwindowbytakingaGaussianwithstandardde-viation( log(1=)=)andtruncatingitatw=O(1 log(1=))).TheDolph-Chebyshevwindowfunctionalsohastheclaimedpropertybutwithminimalbig-Ohconstant[Smi11](inparticular,halftheconstantofthetruncatedGaussian).Theabovedenitionshowsthatastandardwindowfunctionactslikealter,allowingustofocusonasubsetoftheFouriercoecients.Ideally,however,wewouldlikethepassregionofourltertobeas\rataspossible. 6Althoughitisplausiblethatonecouldcombineourlterswiththebinarysearchtechniqueof[GMS05]andachieveanalgorithmwithaO(klogcn)runtime,ourpreliminaryanalysisindicatesthattheresultingalgorithmwouldbeslower.Intuitively,observethatforn=222andk=211,thevaluesofp nk=216:592681andklog2n=45056arequiteclosetoeachother. Definition3.3.Wedenea(;0;;w)\ratwindowfunctiontobeasymmetricvectorF2Rnwithsupp(F)[ w=2;w=2]suchthat^F2[1 ;1+]foralli2[ 0n;0n]andj^Fjforalli=2[ n;n].A\ratwindowfunction(liketheoneinFig.1)canbeob-tainedfromastandardwindowfunctionbyconvolvingitwitha\boxcar"windowfunction,i.e.,aninterval.Specically,wehavethefollowing.Claim3.4.Forany,0,andwith0,thereexistsan(;0;;O(1 0logn ))\ratwindowfunction.Notethatinourapplicationswehave1=nO(1)and=20.Thusthewindowlengthswofthe\ratwindowfunctionandthestandardwindowfunctionarethesameuptoaconstantfactor.Proof.Letf=( 0)=2,andletFbean(f; (0+)n;w)standardwindowfunctionwithminimalw=O(2 0log(+0)n ).Wecanassume;01=(2n)(because[ n;n]=f0gotherwise),solog(+0)n =O(logn ).Let^F0bethesumof1+2(0+f)nadjacentcopiesof^F,normalizedto^F001.Thatis,wedene^F0=P(0+f)nj= (0+f)n^F+j Pfnj= fn^Fjsobytheshifttheorem,inthetimedomainF0a/Fa(0+f)nXj= (0+f)n!j:Since^F0forjijfn,thenormalizationfactorPfnj= fn^Fjisatleast1.Foreachi2[ 0n;0n],thesumontopcontainsallthetermsfromthesumonbottom.Theother20ntermsinthetopsumhavemagnitudeatmost=((0+)n)==(2(0+f)n),soj^F0 1j20n(=(2(0+f)n)).Forjij-531;.992;n,however,^F02(0+f)n=(2(0+f)n).ThusF0isan(;0;;w)\ratwindowfunction,withthecorrectw.(b)PermutationofSpectra.Following[GMS05],wecanpermutetheFourierspectrumasfol-lowsbypermutingthetimedomain:Claim3.5.DenethetransformP;suchthat,givenann-dimensionalvectorx,anintegerthatisinvertiblemodn,andaninteger2[n],(P;x)=xi+.Then(\P;x)i=^x! i.Proof.Foralla,(\P;x)a=Pnj=1xj+!aj=Pnj=1xj!a(j ) 1=^xa 1! a 1.Lemma3.6.Ifj=0,nisapoweroftwo,andisauniformlyrandomoddnumberin[n],thenPr[j2[ C;C]]4C=n.Proof.Ifj=a2forsomeodda,thentheorbitofjisa02forallodda0.Therearethus2round(C=2+1)4C=2+1possiblevaluesin[ C;C]outofn=2+1suchelementsintheorbit,forachanceofatmost4C=n.Notethatforsimplicity,wewillonlyanalyzeouralgorithmwhennisapoweroftwo.Forgeneraln,theanalogofLemma3.6wouldloseann='(n)=O(loglogn)factor,whereisEuler'stotientfunction.Thiswillcorrespondinglyincreasetherunningtimeofthealgorithmongeneraln.Claim3.5allowsustochangethesetofcoe-cientsbinnedtoabucketbychangingthepermutation;Lemma3.6boundstheprobabilityofnon-zerocoe-cientsfallingintothesamebucket.(c)SubsampledFFT.Supposewehaveavectorx2CnandaparameterBdividingn,andwouldliketocompute^y=^x(n=B)fori2[B].Claim3.7.^yistheB-dimensionalFouriertransformofy=Pn=B 1j=0x+Bj:Therefore^ycanbecomputedinO(jsupp(x)j+BlogB)time.Proof.^x(n=B)=n 1Xj=0xj!ij(n=B)=B 1Xa=0n=B 1Xj=0xBj+a!(Bj+a)n=B=B 1Xa=0n=B 1Xj=0xBj+a!ian=B=B 1Xa=0ya!ian=B=^y;asdesired.4AlgorithmAkeyelementofouralgorithmistheinnerloop,whichndsandestimateseach\large"coecientwithconstantprobability.Inx4.1wedescribetheinnerloop,andinx4.2weshowhowtouseittoconstructthefullalgorithm.4.1InnerLoopLetBbeaparameterthatdi-videsn,tobedeterminedlater.LetGbea(1=B;1=(2B);;w)\ratwindowfunctionforsomeandw=O(Blogn ).Wewillhave1=nc,soonecanthinkofitasnegligiblysmall.Therearetwoversionsoftheinnerloop:locationloopsandestimationloops.Locationloopsaregivenaparameterd,andoutputasetI[n]ofdkn=Bcoordinatesthatcontainseachlargecoecientwith 00.511.52050100150200250G(linearscale) (a) 00.20.40.60.81050100150200250bG(linearscale) (b) 1e-081e-071e-061e-050.00010.0010.010.11050100150200250G(logscale) (c) 1e-121e-101e-081e-060.00010.011050100150200250bG(logscale) (d)Figure1:Anexample\ratwindowfunctionforn=256.Thisisthesumof31adjacent(1=22;10 8;133)Dolph-Chebyshevwindowfunctions,givinga(0:11;0:06;210 9;133)\ratwindowfunction(althoughourproofonlyguaranteesatolerance=2910 8,theactualtoleranceisbetter).ThetoprowshowsGandbGinalinearscale,andthebottomrowshowsthesamewithalogscale. \good"probability.EstimationloopsaregivenasetI[n]andestimatexIsuchthateachcoordinateisestimatedwellwith\good"probability.1.Choosearandom;2[n]withodd.2.Deney=G(P;x),soy=Gxi+.Thensupp(y)supp(G)=[w].3.Compute^z=^y(n=B)fori2[B].ByClaim3.7,thisistheDFTofz=Pdw=Be 1j=0y+jB.4.Denethe\hashfunction"h:[n][B]byh(i)=round(iB=n)andthe\oset"o:[n][ n=(2B);n=(2B)]byo(i)=i h(i)(n=B).5.Locationloops:letJcontainthedkcoordinatesofmaximummagnitudein^z.OutputI=fi2[n]jh(i)2Jg,whichhassizedkn=B.6.Estimationloops:fori2I,estimate^xas^x0=^zh()!i=^Go().ByClaim3.7,wecancompute^zinO(w+BlogB)=O(Blogn )time.LocationloopsthustakeO(Blogn +dkn=B)timeandestimationloopstakeO(Blogn +jIj)time.Fig.2illustratestheinnerloop.Forestimationloops,wegetthefollowingguaran-tee:Lemma4.1.LetSbethesupportofthelargestkcoecientsof^x,and^x Scontaintherest.Thenforany1,Pr;[j^x0 ^xj2 kk^x Sk22+32k^xk21]O(k B):Proof.Bythedenitions,^x0=^zh()!i=^Go()=^yi o()!i=^Go():Considerthecasethat^xiszeroeverywherebutati,sosupp(\P;x)=fig.Then^yistheconvolutionof^Gand\P;x,and^Gissymmetric,so^yi o()=(^G\P;x)i o()=^G o()\P;xi=^Go()^x! iwhichshowsthat^x0=^xinthiscase.But^x0 ^xislinearin^x,soingeneralwecanassume^x=0andboundj^x0j.Sincej^x0j=j^zh()=^Go()jj^zh()=(1 )j=j^yi o()j=(1 ),itissucienttoboundj^yi o()j.DeneT=fj2[n]j(i j)2[ 2n=B;2n=B]g.Wehavethatj^yi o()j2=n 1X=0(\P;x)^Gi o()2=n 1Xj=0(\P;x)j^G( j) o()22Xj2T(\P;x)j^G( j) o()2+2Xj=2T(\P;x)j^G( j) o()22(1+)2Xj2T(\P;x)j2+22Xj=2T(\P;x)j2Inthelaststep,thelefttermfollowsfromj^Gaj1+foralla.Therighttermisbecauseforj=2T,j(i j) o(i)jj(i j)j jo(i)j-0.8;أ倀2n=B n=(2B)-0.8;أ倀n=B,andj^Gajforjaj-0.8;أ倀n=B.ByClaim3.5,thisbecomes:j^yi o()j22(1+)2Xj2T^xj! j2+22k^xk21:DeneV=Pj2T^xj! j2.Asachoiceover,VistheenergyofarandomFouriercoecientofthevector^xTsowecanboundtheexpectationover:E[V]=k^xTk22:Now,foreachcoordinatej=i,Pr[j2T]8=BbyLemma3.6.ThusPr[S\T=;]8k=B,sowithprobability1 8k=Boverwehavek^xTk22=\r\r^xTnS\r\r22:LetEbetheeventthatthisoccurs,soEis0withprobability8k=Band1otherwise.WehaveE;[EV]=E[Ek^xTk22]=E[E\r\r^xTnS\r\r22]E[\r\r^xTnS\r\r22]:Furthermore,weknowE;[EV]E[\r\r^xTnS\r\r22]=Xj=2Sj^xjj2Pr[(i j)2[ 2n=B;2n=B]]8 Bk^x Sk22 00.20.40.60.810=23=22MagnitudePermutedsignal \P;x (a) 00.20.40.60.810=23=22MagnitudeConvolvedsignal \P;x^y=\GP;x (b) 00.20.40.60.810=23=22MagnitudeSamplesactuallycomputed ^y=\GP;x^z (c) 00.20.40.60.810=23=22MagnitudeRegionsestimatedlarge ChosenregionSamplecuto\P;x^z (d)Figure2:Exampleinnerloopofthealgorithmonsparseinput.Thisrunhasparametersn=256,k=4,Gbeingthe(0:11;0:06;210 9;133)\ratwindowfunctioninFig.1,andselectingthetop4ofB=16samples.Inpart(a),thealgorithmbeginswithtimedomainaccesstoP;xgivenby(P;x)=xi+,whichpermutesthespectrumofxbypermutingthesamplesinthetimedomain.Inpart(b),thealgorithmcomputesthetimedomainsignaly=GP;x.Thespectrumofy(pictured)islargearoundthelargecoordinatesofP;x.Thealgorithmthencomputes^z,whichistherateBsubsamplingof^yaspicturedinpart(c).Duringestimationloops,thealgorithmestimates^xbasedonthevalueofthenearestcoordinatein^z,namely^zh().Duringlocationloops(part(d)),thealgorithmchoosesJ,thetopdk(here,4)coordinatesof^z,andselectstheelementsof[n]thatareclosesttothosecoordinates(theshadedregionofthepicture).ItoutputsthesetIofpreimagesofthoseelements.Inthisexample,thetwocoordinatesontheleftlandedtoocloseinthepermutationandforma\hashcollision".Asaresult,thealgorithmmissesthesecondfromtheleftcoordinateinitsoutput.OurguaranteeisthateachlargecoordinatehasalowprobabilityofbeingmissedifweselectthetopO(k)samples. byLemma3.6.Therefore,byMarkov'sinequalityandaunionbound,wehaveforanyC0thatPr;[V8C Bk^x Sk22]Pr;[E=0[EVCE;[EV]]8k B+1=C:HencePr;[j^yi o()j216C B(1+)2k^x Sk22+22k^xk21]8k B+1=C:ReplacingCwithB=(32k)andusingj^x0 ^xjj^yi o()j=(1 )givesPr;[j^x0 ^xj2 2k(1+ 1 )2k^x Sk22+2 1 2k^xk21](8+32=)k B:whichimpliestheresult.Furthermore,sincej^Go()j2[1 ;1+],j^zh()jisagoodestimateforj^xj|thedivisionismainlyusefulforxingthephase.Thereforeinlocationloops,wegetthefollowingguarantee:Lemma4.2.DeneE= kk^x Sk22+32k^xk21tobetheerrortoleratedinLemma4.1.Thenforanyi2[n]withj^xj4E,Pr[i=2I]O(k B+1 d)Proof.Withprobabilityatleast1 O(k B),j^x0jj^xj E3E.Inthiscasej^zh()j3(1 )E.Thusitissucienttoshowthat,withprobabilityatleast1 O(1 d),thereareatmostdklocationsj2[B]withj^zjj3(1 )E.Sinceeach^zjcorrespondston=Blocationsin[n],wewillshowthatwithprobabilityatleast1 O(1 d),thereareatmostdkn=Blocationsj2[n]withj^x0jj3(1 )2E.LetU=fj2[n]jj^xjjEg,andV=fj2[n]jj^x0j ^xjjEg.Thereforefj2[n]jj^x0jj2EgU[V.Since23(1 )2,wehavefjjj^x0jj3(1 )2EgjU[Vj:WealsoknowjUjk+kx Sk22 E2k(1+1=)andbyLemma4.1,E[jVj]O(kn B):ThusbyMarkov'sinequalityPr[jVjdkn B]O(1 d),orPr[jU[Vjdkn B+k(1+1=)]O(1 d):SincetheRHSisonlymeaningfulford1=wehavedkn Bk(1+1=).ThereforePr[jU[Vjdkn B]O(1 d):andthusPr[jfj2[B]jj^zjj3(1 )Egjkd]O(1 d):Henceaunionboundoverthisandtheprobabilitythatj^x0 ^xjEgivesPr[i=2I]O(k B+1 d)asdesired.4.2OuterLoopOuralgorithmisparameterizedbyand.ItrunsL=O(logn)iterationsoftheinnerloop,withparametersB=O( nk log(n=))7andd=O(1=)aswellas.1.Forr2f1;:::;Lg,runalocationinnerlooptogetIr.2.Foreachi2I=I1[[IL,lets=jfrji2Irgj.3.LetI0=fi2IjsL=2g.4.Forr2f1;:::;Lg,runanestimationinnerlooponI0toget^xrI0.5.Fori2I0,estimate^x0=median(f^xrji2I0g),wherewetakethemedianinrealandimaginarycoordinatesseparately.Lemma4.3.ThealgorithmrunsintimeO( nklog(n=) logn).Proof.Toanalyzethisalgorithm,notethatjI0jL 2Xs=XrjIrj=Ldkn=B 7NotethatBischoseninordertominimizetherunningtime.Forthepurposeofcorrectness,itsucesthatBck=forsomeconstantc.Wewillusethisfactlaterintheexperimentalsection. orjI0j2dkn=B.ThereforetherunningtimeofboththelocationandestimationinnerloopsisO(Blogn +dkn=B).ComputingI0andcomputingthemediansbothtakelineartime,namelyO(Ldkn=B).ThusthetotalrunningtimeisO(LBlogn +Ldkn=B).PlugginginB=O( nk log(n=))andd=O(1=),thisrunningtimeisO( nklog(n=) logn).WerequireB=\n(k=),however;forkn=log(n=),thiswouldcausetheruntimetobelarger.Butinthiscase,thepredictedruntimeis\n(nlogn)already,sothestandardFFTisfasterandwecanfallbackonit.Theorem4.4.Runningthealgorithmwithparameters;1gives^x0satisfyingk^x0 ^xk21 kk^x Sk22+2k^xk21:withprobability1 1=nandrunningtimeO( nklog(n=) logn).Proof.DeneE= kk^x Sk22+32k^xk21Lemma4.2saysthatineachlocationiterationr,foranyiwithj^xj4E,Pr[i=2Ir]O(k B+1 d)1=4:ThusE[s]3L=4,andeachiterationisanindependenttrial,sobyaChernoboundthechancethatsL=2isatmost1=2\n(L)1=n3.Thereforebyaunionbound,withprobabilityatleast1 1=n2,i2I0foralliwithj^xj4E.Next,Lemma4.1saysthatforeachestimationiterationrandindexi,Pr[j^xr ^xjE]O(k B)1=4:Therefore,withprobability1 2 \n(L)1 1=n3,j^xr ^xjEinatleast2L=3oftheiterations.Sincereal(^x0)isthemedianofthereal(^xr),theremustexisttworwithj^xr ^xjEbutonereal(^xr)abovereal(^x0)andonebelow.Henceoneoftheserhasjreal(^x0 ^x)jjreal(^xr ^x)jE,andsimilarlyfortheimaginaryaxis.Thenj^x0 ^xjp 2max(jreal(^x0 ^x)j;jimag(^x0 ^x)j)p 2E:ByaunionboundoverI0,withprobabilityatleast1 1=n2wehavej^x0 ^xjp 2Eforalli2I0.Sincealli=2I0have^x0=0andj^xj4Ewithprobability1 1=n2,withtotalprobability1 2=n2wehavek^x0 ^xk2116E2=16 kk^x Sk22+482k^xk21:Rescalingandgivesourtheorem.5ExperimentalEvaluation5.1ImplementationWeimplementouralgorithminC++usingtheStandardTemplateLibrary,andrefertothisimplementationassFFT(whichstandsfor\sparseFFT").Wehavetwoversions:sFFT1.0,whichimplementsthealgorithmasinx4,andsFFT2.0,whichaddsaheuristictoimprovetheruntime.sFFT2.0.Theideaoftheheuristicistoapplythelterusedin[Man92]torestrictthelocationsofthelargecoecients.TheheuristicisparameterizedbyanMdividingn.Duringapreprocessingstage,itdoesthefollowing:1.Choose2[n]uniformlyatrandom.2.Fori2[M],sety=x(n=M)+.3.Compute^y.4.OutputT[M]containingthe2klargestelementsof^y.Observethat^y=Pn=Mj=0^xMj+! (+Mj).Thus,E[j^yj2]=XjmodMj^xjj2:Thismeansthatthelterisveryecient,inthatithasnoleakageatall.Also,itissimpletocompute.Unfortunately,itcannotbe\randomized"usingP;:afterpermutingby,anytwocollidingelementsjandj0(i.e.,suchthatj=j0(modM))continuetocollide.Nevertheless,if^xjislarge,thenj(modM)islikelytolieinT|atleastheuristicallyonrandominput.sFFT2.0completesthealgorithmassumingthatall\large"coecientsjhavejmodMinT.Thatis,inthemainalgorithmofx4,wethenrestrictoursetsIrtocontainonlycoordinatesiwith(imodM)2T.WeexpectthatjIrj2k Mdkn=Bratherthanthepreviousdkn=B.ThismeansthatourheuristicwillimprovetheruntimeoftheinnerloopsfromO(Blog(n=)+dkn=B)toO(Blog(n=)+k Mdkn=B+M+dk),atthecostofO(MlogM)preprocessing.Notethatonworstcaseinput,sFFT2.0maygiveincorrectoutputwithhighprobability.Forexample,ifx=1wheniisamultipleofn=Mand0otherwise,theny=0withprobability1 M=nandthealgorithmwilloutput0oversupp(x).However,inpracticethealgorithmworksfor"sucientlyrandom"x. Claim5.1.Asaheuristicapproximation,sFFT2.0runsinO((k2nlog(n=)=)1=3logn)aslongask2nlog(n=).Justication.FirstwewillshowthattheheuristicimprovestheinnerlooprunningtimetoO(Blog(n=)+k Mdkn=B+M+dk),thenoptimizetheparametersMandB.Heuristically,onewouldexpecteachoftheIrtobeajTj MfactorsmallerthanifwedidnotrequiretheelementstolieinTmoduloM.Hence,weexpecteachoftheIrandI0tohavesizejTj Mdkn=B=O(k Mdkn=B).Thenineachlocationloop,ratherthanspendingO(dkn=B)timetolistouroutput,wespendO(k Mdkn=B)time|plusthetimerequiredtogureoutwheretostartlistingcoordinatesfromeachofthedkchosenelementsJof^z.WedothisbysortingJandfiji2Tg(modM),thenscanningthroughtheelements.IttakesO(M+dk)timetosortO(dk)elementsin[M],sothetotalruntimeofeachlocationloopisO(Blog(n=)+k Mdkn=B+M+dk).Theestimationloopsareevenfaster,sincetheybenetfromjI0jbeingsmallerbutavoidtheM+dkpenalty.ThefullalgorithmdoesO(MlogM)preprocessingandrunstheinnerloopL=O(logn)timeswithd=O(1=).Therefore,givenparametersBandM,thealgorithmtakesO(MlogM+Blogn logn+k Mkn Blogn+Mlogn+k logn)time.OptimizingoverB,wetakeO(Mlogn+k n Mlog(n=)logn+k logn)time.Then,optimizingoverM,thisbecomesO((k2nlog(n=)=)1=3logn+k logn)time.Ifk2nlog(n=),thersttermdominates.Notethatthisisan(nlog(n=) k)1=6factorsmallerthantherunningtimeofsFFT1.0.5.2NumericalResultsWeevaluatetheperfor-manceofsFFT1.0andsFFT2.0,andcomparethemagainsttwobaselines:1)FFTW3.2.2[FJ],whichisthefastestpublicimplementationforcomputingtheDFTandhasaruntimeofO(nlog(n)),and2)AAFFT0.9[Iwe08],whichisthepriorsublinearalgorithmwiththefastestempiricalruntime[IGS07].Forcompleteness,wecompareagainsttwovariantsofFFTW:basicandoptimized.Theoptimizedversionrequirespreprocess-ing,duringwhichthealgorithmistunedtoaparticularmachinehardware.Incontrast,ourcurrentimplemen-tationsofsFFTvariantsdonotperformhardwarespe-cicoptimizations.ExperimentalSetup.Thetestsignalsaregener-atedinamannersimilartothatin[IGS07].Fortherun-timeexperiments,kfrequenciesareselecteduniformlyatrandomfrom[n]andassignedamagnitudeof1andauniformlyrandomphase.Therestaresettozero.Forthetolerancetonoiseexperiments,thetestsignalsaregeneratedasbeforebuttheyarecombinedwithad-ditivewhiteGaussiannoise,whosevariancevariesde-pendingonthedesiredSNR.Eachpointinthegraphsistheaverageover100runswithdierentinstancesoftestsignalsanddierentinstancesofnoise.Inallex-periments,theparametersofsFFT1.0,sFFT2.0,andAAFFT0.9arechosensothattheaverageL1errorintheabsenceofnoiseisbetween10 7and10 8pernon-zerofrequency.8Finally,allexperimentsarerunonaDualCoreIntel3.0GHzCPUrunningUbuntuLinux10.04withacachesizeof6144KBand8GBofRAM.Runtimevs.SignalSize.Inthisexperiment,wexthesparsityparameterk=50andreporttheruntimeofthecomparedalgorithmsfor12dierentsignalsizesn:214;215;:::;226.Weplot,inFig.3(a),themean,maximum,andminimumruntimesforsFFT1.0,sFFT2.0,AAFFT0.9,FFTW,andFFTWOPT,over100runs.TherelativeruntimesofAAFFT0.9andFFTWareconsistentwiththosereportedin[IGS07],Fig.3.1.Asexpected,Fig.3(a)showsthattheruntimesofsFFTandFFTW(andtheirvariants)areapproxi-matelylinearinthelogscale.However,theslopeofthelineforsFFTislessthantheslopeforFFTW,whichisaresultofsFFT'ssub-linearruntime.Further,thegureshowsthatforsignalsizesn100;000bothsFFT1.0andsFFT2.0arefasterthanbothvariantsofFFTWatrecoveringtheexact50non-zerocoecients.Ontheotherhand,theruntimeofAAFFT0.9isproportionaltopolylog(n)andthusitappearsalmostconstantasweincreasethesignalsize.Forn=226(i.e.,67,108,864),AAFFT0.9eventuallybeatstheruntimeofsFFT1.0andisonly2timesslowerthansFFT2.0.Overall,foralargerangeofsignalsizesfromabout100,000to67,108,864,sFFThasthefastestruntime.Runtimevs.NumberofNon-ZeroFrequen-cies.Inthisexperiment,wexthesignalsizeton=222(i.e.4,194,304)andevaluatetheruntimeofsFFTvs.thenumberofnon-zerofrequenciesk.Foreachvalueofk,theexperimentisrepeated100times.Fig.3(b)il-lustratesthemean,maximum,andminimumruntimesforthecomparedalgorithms.Fig.3(b)showsthatsFFT1.0andsFFT2.0haveafasterruntimethanbasicFFTWforkupto2000and 8Forthevaluesofkandnthatareclosetotheonesconsideredin[IGS07],weusetheparameterstherein.Forotherranges,wefollowtheguidelinesintheAAFFT0.9documentation[Iwe08]. 0.001 0.01 0.1 1 10 214 215 216 217 218 219 220 221 222 223 224 225 226 Run Time (sec)Signal Size (n)Run Time vs Signal Size (k=50)sFFT 1.0 sFFT 2.0 FFTW FFTW OPT AAFFT 0.9 (a) 0.01 0.1 1 10 100 1000 Run Time (sec)Number of Non-Zero Frequencies (k)Run Time vs Sparsity (n=222)sFFT 1.0 sFFT 2.0 FFTW FFTW OPT AAFFT 0.9 (b)Figure3:(a)Runtimevs.signalsize.Thegureshowsthatforalargerangeofsignalsizes,n2[217;226],sFFTisfasterthanFFTWandthestate-of-the-artsublinearalgorithm.(b)Runtimevs.sparsityparameter.ThegureshowsthatsFFTsignicantlyextendstherangeofapplicationsforwhichsparseapproximationofDFTispractical,andbeatstheruntimeofFFTWforvaluesofkorderofmagnitudelargerthanthoseachievedbypastwork.2200,respectively.WhencomparedtotheoptimizedFFTW,thecrossingvaluesbecome500and1000.Thus,sFFT'scrossingvaluesarearoundp n.Incomparison,AAFFT0.9isfasterthanFFTWvariantsforkbetween100and200.Further,therelativeruntimesofAAFFT0.9,andFFTW3.2.2areclosetothosereportedin[IGS07],Fig.3.2.Finally,FFTWhasaruntimeofO(nlog(n)),whichisindependentofthenumberofnon-zerofrequenciesk,ascanbeseeninFig.3(b).Thus,asthesparsityofthesignaldecreases(i.e.,kincreases),FFTWeventuallybecomesfasterthansFFTandAAFFT.Nonetheless,theresultsshowthatincomparisonwiththefastestpriorsublinearalgorithm[IGS07],sFFTsignicantlyextendstherangeofapplicationsforwhichsparseapproximationofDFTispractical.RobustnesstoNoise.Last,wewouldliketocheckthatsFFT'sreducedruntimedoesnotcomeattheexpenseofreducingrobustnesstonoise.Thus,wecomparetheperformanceofsFFT1.0andsFFT2.0againstAAFFT0.9,fordierentlevelsofwhiteGaussiannoise.Specically,wexthen=222andk=50,andexperimentwithdierentsignalSNRs.9WechangetheSNRbychangingthevarianceoftheGaussiannoise.Foreachnoisevariance,werunmultipleexperimentsbyregeneratingnewinstancesofthesignalandnoisevectors.Foreachrun,wecomputetheerrormetricperastheaverageL1errorbetweentheoutput 9TheSNRisdenedasSNR=20logjjxjj2 jjzjj2,wherezisann-dimensionalnoisevector.approximation^x0(restrictedtoitsklargestentries)andthebestk-sparseapproximationof^xreferredtoas^y:AverageL1Error=1 kX0inj^x0 ^yj:Fig.4plotstheaverageerrorpernon-zerofrequencyforsFFT1.0,sFFT2.0,andAAFFT0.9.Thegureshowsthatallthreealgorithmsarestableundernoise.Further,sFFTvariantsappeartobemorerobusttonoisethanAAFFT0.9.References[AFS93]R.Agrawal,C.Faloutsos,andA.Swami.Ecientsimilaritysearchinsequencedatabases.Int.Conf.onFoundationsofDataOrganizationandAlgorithms,pages69{84,1993.[AGS03]A.Akavia,S.Goldwasser,andS.Safra.Provinghard-corepredicatesusinglistdecoding.FOCS,pages146{,2003.[Aka10]A.Akavia.Deterministicsparsefourierapproxima-tionviafoolingarithmeticprogressions.COLT,pages381{393,2010.[CGX96]A.Chandrakasan,V.Gutnik,andT.Xanthopou-los.Datadrivensignalprocessing:Anapproachforenergyecientcomputing.InternationalSymposiumonLowPowerElectronicsandDesign,1996.[CM06]G.CormodeandS.Muthukrishnan.Combinatorialalgorithmsforcompressedsensing.SIROCCO,2006.[CRT06]E.Candes,J.Romberg,andT.Tao.Robustuncertaintyprinciples:Exactsignalreconstructionfromhighlyincompletefrequencyinformation.IEEE 1e-07 1e-06 1e-05 0.0001 0.001 0.01 0.1 1 -20 0 20 40 60 80 100 Average L1 Error per EnrtySNR (dB)Robustness vs SNR (n=222, k=50)sFFT 1.0 sFFT 2.0 AAFFT 0.9 Figure4:RobustnesstoNoise(n=222;k=50).ThegureshowsthatbothsFFTandAAFFT0.9arestableinthepresenceofnoisebutsFFThaslowererrors.TransactionsonInformationTheory,52:489{509,2006.[Don06]D.Donoho.Compressedsensing.IEEETransac-tionsonInformationTheory,52(4):1289{1306,2006.[DRZ07]I.Daubechies,O.Runborg,andJ.Zou.Asparsespectralmethodforhomogenizationmultiscaleprob-lems.MultiscaleModel.Sim.,6(3):711{740,2007.[FJ]M.FrigoandS.G.Johnson.Fftw3.2.3..[GGI02]A.Gilbert,S.Guha,P.Indyk,M.Muthukrish-nan,andM.Strauss.Near-optimalsparsefourierrep-resentationsviasampling.STOC,2002.[GI10]A.GilbertandP.Indyk.Sparserecoveryusingsparsematrices.ProceedingsofIEEE,2010.[GMS05]A.Gilbert,M.Muthukrishnan,andM.Strauss.Improvedtimeboundsfornear-optimalspacefourierrepresentations.SPIEConference,Wavelets,2005.[GST08]A.C.Gilbert,M.J.Strauss,andJ.A.Tropp.Atutorialonfastfouriersampling.SignalProcessingMagazine,2008.[IGS07]M.A.Iwen,A.Gilbert,andM.Strauss.Empiricalevaluationofasub-lineartimesparsedftalgorithm.CommunicationsinMathematicalSciences,5,2007.[Iwe08]M.Iwen.AAFFT(AnnAr-borFastFourierTransform).,2008.[Iwe10a]M.A.Iwen.Combinatorialsublinear-timefourieralgorithms.FoundationsofComputationalMathemat-ics,10:303{338,2010.[Iwe10b]M.A.Iwen.Improvedapproximationguaranteesforsublinear-timefourieralgorithms.ArxivpreprintarXiv:1010.0014,2010.[KKL88]J.Kahn,G.Kalai,andN.Linial.Thein\ruenceofvariablesonbooleanfunctions.FOCS,1988.[KM91]E.KushilevitzandY.Mansour.Learningdecisiontreesusingthefourierspectrum.STOC,1991.[LMN93]N.Linial,Y.Mansour,andN.Nisan.Con-stantdepthcircuits,fouriertransform,andlearnability.JournaloftheACM(JACM),1993.[LVS11]MengdaLin,A.P.Vinod,andChongMengSamsonSee.Anew\rexiblelterbankforlowcomplexityspectrumsensingincognitiveradios.JournalofSignalProcessingSystems,62(2):205{215,2011.[Man92]Y.Mansour.Randomizedinterpolationandap-proximationofsparsepolynomials.ICALP,1992.[MNL10]AbdullahMueen,SumanNath,andJieLiu.Fastapproximatecorrelationformassivetime-seriesdata.InSIGMODConference,pages171{182,2010.[O'D08]R.O'Donnell.Sometopicsinanalysisofbooleanfunctions(tutorial).STOC,2008.[OSB99]A.Oppenheim,R.W.Schafer,andJ.R.Buck.Discrete-timesignalprocessing.UpperSaddleRiver,N.J.:PrenticeHall.ISBN0-13-754920-2.,1999.[PS10]E.PoratandM.Strauss.Sublineartime,measurement-optimal,sparserecoveryforall.Manuscript,2010.[Smi11]JuliusO.Smith.SpectralAudioSignalProcessing,October2008Draft.,accessed2011-07-11.onlinebook.