The setup for measuring the SHG is described in the supporting online material. We expect that the SHG strongly depends on the resonance that is excited. Obviously, the incident polarization and the detuning of the laser wavelength from the resonance are of particular interest. One possibility for controlling the detuning is to change the laser wavelength for a given sample, which is difficult because of the extremely broad tuning range required. Thus, we follow an alternative route, lithographic tuning (in which the incident laser wavelength of 1.5 μm, as well as the detection system, remains fixed), and tune the resonance positions by changing the SRR size. In this manner, we can also guarantee that the optical properties of the SRR constituent materials are identical for all configurations. The blue bars in Fig. 1 summarize the measured SHG signals. For excitation of the …

Fig. 3. Theory, presented as the experiment (see Fig. 1). The SHG source is the magnetic component of the Lorentz force on metal electrons in the SRRs.

Department of Computer Science, University of Toronto, 6 King's College Road, Toronto, Ontario M5S 3G4, Canada. *To whom correspondence should be addressed; E-mail: hinton@cs.toronto.edu

We describe a nonlinear generalization of PCA that uses an adaptive, multilayer "encoder" network to transform the high-dimensional data into a low-dimensional code and a similar "decoder" network to recover the data from the code. Starting with random weights in the two networks, they can be trained together by minimizing the discrepancy between the original data and its reconstruction. The required gradients are easily obtained by using the chain rule to backpropagate error derivatives first through the decoder network and then through the encoder network (1). The whole system is called an "autoencoder" and is depicted in Fig. 1.

It is difficult to optimize the weights in nonlinear autoencoders that have multiple hidden layers (2–4). With large initial weights, autoencoders typically find poor local minima; with small initial weights, the gradients in the early layers are tiny, making it infeasible to train autoencoders with many hidden layers. If the initial weights are close to a good solution, gradient descent works well, but finding such initial weights requires a very different type of algorithm that learns one layer of features at a time. We introduce this procedure for binary data, generalize it to real-valued data, and show that it works well for a variety of data sets.

An ensemble of binary vectors (e.g., images) can be modeled using a two-layer network called a "restricted Boltzmann machine" (RBM) (5, 6) in which stochastic, binary pixels are connected to stochastic, binary feature detectors using symmetrically weighted connections. The pixels correspond to "visible" units of the RBM because their states are observed; the feature detectors correspond to "hidden" units. A joint configuration \( (v, h) \) of the visible and hidden units has an energy (7) given by

\[ E(v, h) = -\sum_{i \in \text{pixels}} b_i v_i - \sum_{j \in \text{features}} b_j h_j - \sum_{i,j} v_i h_j w_{ij} \tag{1} \]

where \( v_i \) and \( h_j \) are the binary states of pixel \( i \) and feature \( j \), \( b_i \) and \( b_j \) are their biases, and \( w_{ij} \) is the weight between them. The network assigns a probability to every possible image via this energy function, as explained in (8).
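The authors' released implementation was Matlab (8); purely as an illustration, the following NumPy sketch evaluates the energy of Eq. 1 and the unnormalized probability an RBM assigns to an image by summing \( \exp(-E) \) over all hidden configurations. All names here are hypothetical, and the brute-force sum is only feasible at toy sizes because the partition function is intractable for realistically sized networks.

```python
import numpy as np
from itertools import product

def rbm_energy(v, h, b_vis, b_hid, W):
    """Energy of a joint configuration (v, h), Eq. 1:
    E(v, h) = -sum_i b_i v_i - sum_j b_j h_j - sum_ij v_i h_j w_ij."""
    return -(b_vis @ v) - (b_hid @ h) - (v @ W @ h)

def unnormalized_prob(v, b_vis, b_hid, W):
    """p(v) is proportional to the sum of exp(-E(v, h)) over all hidden
    states; dividing by the same sum over all (v, h) pairs would give the
    true probability, but that normalizer is intractable at scale."""
    n_hid = W.shape[1]
    return sum(np.exp(-rbm_energy(v, np.array(h), b_vis, b_hid, W))
               for h in product([0, 1], repeat=n_hid))

# Toy usage: 6 visible pixels, 3 hidden feature detectors.
rng = np.random.default_rng(0)
W = 0.01 * rng.standard_normal((6, 3))
b_vis, b_hid = np.zeros(6), np.zeros(3)
v = rng.integers(0, 2, size=6).astype(float)
print(unnormalized_prob(v, b_vis, b_hid, W))
```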
Fig. 1. Pretraining consists of learning a stack of restricted Boltzmann machines (RBMs), each having only one layer of feature detectors. The learned feature activations of one RBM are used as the "data" for training the next RBM in the stack. After the pretraining, the RBMs are "unrolled" to create a deep autoencoder, which is then fine-tuned using backpropagation of error derivatives.

Fig. 2. (A) Top to bottom: Random samples of curves from the test data set; reconstructions produced by the six-dimensional deep autoencoder; reconstructions by logistic PCA (8) using six components; reconstructions by logistic PCA and standard PCA using 18 components. The average squared error per image for the last four rows is 1.44, 7.64, 2.45, 5.90. (B) Top to bottom: A random test image from each class; reconstructions by the 30-dimensional autoencoder; reconstructions by 30-dimensional logistic PCA and standard PCA. The average squared errors for the last three rows are 3.00, 8.01, and 13.87. (C) Top to bottom: Random samples from the test data set; reconstructions by the 30-dimensional autoencoder; reconstructions by 30-dimensional PCA. The average squared errors are 126 and 135.

Fig. 3. (A) The two-dimensional codes for 500 digits of each class produced by taking the first two principal components of all 60,000 training images. (B) The two-dimensional codes found by a 784-1000-500-250-2 autoencoder. For an alternative visualization, see (8).

Fig. 4. (A) The fraction of retrieved documents in the same class as the query when a query document from the test set is used to retrieve other test set documents, averaged over all 402,207 possible queries. (B) The codes produced by two-dimensional LSA. (C) The codes produced by a 2000-500-250-125-2 autoencoder.

The probability of a training image can be raised by adjusting the weights and biases to lower the energy of that image and to raise the energy of similar, "confabulated" images that the network would prefer to the real data. Given a training image, the binary state \( h_j \) of each feature detector \( j \) is set to 1 with probability \( \sigma(b_j + \sum_i v_i w_{ij}) \), where \( \sigma(x) \) is the logistic function \( 1/[1 + \exp(-x)] \), \( b_j \) is the bias of \( j \), \( v_i \) is the state of pixel \( i \), and \( w_{ij} \) is the weight between \( i \) and \( j \). Once binary states have been chosen for the hidden units, a "confabulation" is produced by setting each \( v_i \) to 1 with probability \( \sigma(b_i + \sum_j h_j w_{ij}) \), where \( b_i \) is the bias of \( i \). The states of the hidden units are then updated once more, so that they represent features of the confabulation. The change in a weight is given by

\[ \Delta w_{ij} = \varepsilon \left( \langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{recon}} \right) \tag{2} \]

where \( \varepsilon \) is a learning rate, \( \langle v_i h_j \rangle_{\text{data}} \) is the fraction of times that pixel \( i \) and feature detector \( j \) are on together when the feature detectors are being driven by data, and \( \langle v_i h_j \rangle_{\text{recon}} \) is the corresponding fraction for confabulations. A simplified version of the same learning rule is used for the biases. The learning works well even though it is not exactly following the gradient of the log probability of the training data (6).

A single layer of binary features is not the best way to model the structure in a set of images. After learning one layer of feature detectors, we can treat their activities, when they are being driven by the data, as data for learning a second layer of features. The first layer of feature detectors then become the visible units for learning the next RBM. This layer-by-layer learning can be repeated as many times as desired.
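As an illustrative sketch only, not the authors' released code, one training update matching the rule in Eq. 2 can be written in a few lines of NumPy. The name cd1_step is hypothetical; following common practice, the pairwise statistics use activation probabilities rather than sampled binary states, which reduces variance without changing the expected update.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    """Logistic function sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(V, W, b_vis, b_hid, eps=0.1):
    """One contrastive-divergence update on a batch of binary images V
    (one image per row): drive the hidden units by the data, produce a
    confabulation, re-infer the hidden units, then apply
    delta_w_ij = eps * (<v_i h_j>_data - <v_i h_j>_recon)."""
    # p(h_j = 1) = sigma(b_j + sum_i v_i w_ij), driven by the data.
    p_hid = sigmoid(V @ W + b_hid)
    H = (rng.random(p_hid.shape) < p_hid).astype(float)  # stochastic binary states
    # Confabulation: p(v_i = 1) = sigma(b_i + sum_j h_j w_ij).
    p_vis = sigmoid(H @ W.T + b_vis)
    V_recon = (rng.random(p_vis.shape) < p_vis).astype(float)
    # Update the hidden units once more from the confabulation.
    p_hid_recon = sigmoid(V_recon @ W + b_hid)
    n = V.shape[0]
    W += eps * (V.T @ p_hid - V_recon.T @ p_hid_recon) / n
    # Simplified version of the same rule for the biases.
    b_vis += eps * (V - V_recon).mean(axis=0)
    b_hid += eps * (p_hid - p_hid_recon).mean(axis=0)
    return W, b_vis, b_hid
```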
It can be shown that adding an extra layer always improves a lower bound on the log probability that the model assigns to the training data, provided the number of feature detectors per layer does not decrease and their weights are initialized correctly (9). This bound does not apply when the higher layers have fewer feature detectors, but the layer-by-layer learning algorithm is nonetheless a very effective way to pretrain the weights of a deep autoencoder. Each layer of features captures strong, high-order correlations between the activities of units in the layer below. For a wide variety of data sets, this is an efficient way to progressively reveal low-dimensional, nonlinear structure.

After pretraining multiple layers of feature detectors, the model is "unfolded" (Fig. 1) to produce encoder and decoder networks that initially use the same weights. The global fine-tuning stage then replaces stochastic activities by deterministic, real-valued probabilities and uses backpropagation through the whole autoencoder to fine-tune the weights for optimal reconstruction.

For continuous data, the hidden units of the first-level RBM remain binary, but the visible units are replaced by linear units with Gaussian noise (10). If this noise has unit variance, the stochastic update rule for the hidden units remains the same and the update rule for visible unit \( i \) is to sample from a Gaussian with unit variance and mean \( b_i + \sum_j h_j w_{ij} \).

In all our experiments, the visible units of every RBM had real-valued activities, which were in the range [0, 1] for logistic units. While training higher-level RBMs, the visible units were set to the activation probabilities of the hidden units in the previous RBM, but the hidden units of every RBM except the top one had stochastic binary values. The hidden units of the top RBM had stochastic real-valued states drawn from a unit-variance Gaussian whose mean was determined by the input from that RBM's logistic visible units. This allowed the low-dimensional codes to make good use of continuous variables and facilitated comparisons with PCA. Details of the pretraining and fine-tuning can be found in (8).

To demonstrate that our pretraining algorithm allows us to fine-tune deep networks efficiently, we trained a very deep autoencoder on a synthetic data set containing images of "curves" that were generated from three randomly chosen points in two dimensions (8). For this data set, the true intrinsic dimensionality is known, and the relationship between the pixel intensities and the six numbers used to generate them is highly nonlinear. The pixel intensities lie between 0 and 1 and are very non-Gaussian, so we used logistic output units in the autoencoder, and the fine-tuning stage of the learning minimized the cross-entropy error

\[ -\sum_i \left[ p_i \log \hat{p}_i + (1 - p_i) \log(1 - \hat{p}_i) \right] \]

where \( p_i \) is the intensity of pixel \( i \) and \( \hat{p}_i \) is the intensity of its reconstruction. The autoencoder consisted of an encoder with layers of size (28 × 28)-400-200-100-50-25-6 and a symmetric decoder. The six units in the code layer were linear and all the other units were logistic. The network was trained on 20,000 images and tested on 10,000 new images. The autoencoder discovered how to convert each 784-pixel image into six real numbers that allow almost perfect reconstruction (Fig. 2A). PCA gave much worse reconstructions. Without pretraining, the very deep autoencoder always reconstructs the average of the training data, even after prolonged fine-tuning (8). Shallower autoencoders with a single hidden layer between the data and the code can learn without pretraining, but pretraining greatly reduces their total training time (8). When the number of parameters is the same, deep autoencoders can produce lower reconstruction errors on test data than shallow ones, but this advantage disappears as the number of parameters increases (8).

Next, we used a 784-1000-500-250-30 autoencoder to extract codes for all the handwritten digits in the MNIST training set (11). The Matlab code that we used for the pretraining and fine-tuning is available in (8). Again, all units were logistic except for the 30 linear units in the code layer.
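A sketch of the layer-by-layer pretraining and the subsequent unrolling described above, again illustrative rather than the authors' Matlab code: it reuses the hypothetical cd1_step and sigmoid from the earlier sketch, keeps every unit logistic (the experiments above used linear code units), and omits the decoder biases that a full implementation would carry over from the RBM visible biases.

```python
import numpy as np

def pretrain_stack(data, layer_sizes, epochs=10, eps=0.1, seed=0):
    """Greedily train one RBM per entry of layer_sizes; the hidden
    activation probabilities of each RBM become the 'data' for the next."""
    rng = np.random.default_rng(seed)
    weights, X = [], data
    for n_hid in layer_sizes:
        W = 0.01 * rng.standard_normal((X.shape[1], n_hid))
        b_vis, b_hid = np.zeros(X.shape[1]), np.zeros(n_hid)
        for _ in range(epochs):
            W, b_vis, b_hid = cd1_step(X, W, b_vis, b_hid, eps)
        weights.append((W, b_hid))
        X = sigmoid(X @ W + b_hid)  # probabilities, not samples, feed the next RBM
    return weights

def unrolled_forward(weights, X):
    """Deterministic pass through the unrolled autoencoder: the decoder
    initially uses the transposes of the encoder weights, and fine-tuning
    would backpropagate reconstruction error through the whole stack."""
    for W, b in weights:                    # encoder
        X = sigmoid(X @ W + b)
    code = X
    for W, _ in reversed(weights):          # decoder (biases omitted here)
        X = sigmoid(X @ W.T)
    return code, X
```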
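The cross-entropy objective used in this fine-tuning stage translates directly into code; a minimal helper (hypothetical name), clipping the reconstruction to avoid log(0):

```python
import numpy as np

def cross_entropy_error(p, p_hat, tiny=1e-12):
    """-sum_i [ p_i log p_hat_i + (1 - p_i) log(1 - p_hat_i) ], where p_i is
    the intensity of pixel i and p_hat_i the intensity of its reconstruction."""
    p_hat = np.clip(p_hat, tiny, 1.0 - tiny)
    return -np.sum(p * np.log(p_hat) + (1.0 - p) * np.log(1.0 - p_hat))
```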
After fine-tuning on all 60,000 training images, the autoencoder was tested on 10,000 new images and produced much better reconstructions than did PCA (Fig. 2B). A two-dimensional autoencoder produced a better visualization of the data than did the first two principal components (Fig. 3).

We also used a 625-2000-1000-500-30 autoencoder with linear input units to discover 30-dimensional codes for grayscale image patches that were derived from the Olivetti face data set (12). The autoencoder clearly outperformed PCA (Fig. 2C).

When trained on documents, autoencoders produce codes that allow fast retrieval. We represented each of 804,414 newswire stories (13) as a vector of document-specific probabilities of the 2000 commonest word stems, and we trained a 2000-500-250-125-10 autoencoder on half of the stories with the use of the multiclass cross-entropy error function \( -\sum_i p_i \log \hat{p}_i \) for the fine-tuning. The 10 code units were linear and the remaining hidden units were logistic. When the cosine of the angle between two codes was used to measure similarity, the autoencoder clearly outperformed latent semantic analysis (LSA) (14), a well-known document retrieval method based on PCA (Fig. 4). Autoencoders (8) also outperform local linear embedding, a recent nonlinear dimensionality reduction algorithm (15).

Layer-by-layer pretraining can also be used for classification and regression. On a widely used version of the MNIST handwritten digit recognition task, the best reported error rates are 1.6% for randomly initialized backpropagation and 1.4% for support vector machines. After layer-by-layer pretraining in a 784-500-500-2000-10 network, backpropagation using steepest descent and a small learning rate achieves 1.2% (8). Pretraining helps generalization because it ensures that most of the information in the weights comes from modeling the images. The very limited information in the labels is used only to slightly adjust the weights found by pretraining.

It has been obvious since the 1980s that backpropagation through deep autoencoders would be very effective for nonlinear dimensionality reduction, provided that computers were fast enough, data sets were big enough, and the initial weights were close enough to a good solution. All three conditions are now satisfied. Unlike nonparametric methods (15, 16), autoencoders give mappings in both directions between the data and code spaces, and they can be applied to very large data sets because both the pretraining and the fine-tuning scale linearly in time and space with the number of training cases.

References and Notes
1. D. C. Plaut, G. E. Hinton, Comput. Speech Lang. 2, 35 (1987).
2. D. DeMers, G. Cottrell, Advances in Neural Information Processing Systems 5 (Morgan Kaufmann, San Mateo, CA, 1993), pp. 580–587.
3. R. Hecht-Nielsen, Science 269, 1860 (1995).
4. N. Kambhatla, T. Leen, Neural Comput. 9, 1493 (1997).
5. P. Smolensky, in Parallel Distributed Processing: Volume 1: Foundations, D. E. Rumelhart, J. L. McClelland, Eds. (MIT Press, Cambridge, 1986), pp. 194–281.
6. G. E. Hinton, Neural Comput. 14, 1711 (2002).
7. J. J. Hopfield, Proc. Natl. Acad. Sci. U.S.A. 79, 2554 (1982).
8. See supporting material on Science Online.
9. G. E. Hinton, S. Osindero, Y. W. Teh, Neural Comput. 18, 1527 (2006).
10. M. Welling, M. Rosen-Zvi, G. Hinton, Advances in Neural Information Processing Systems 17 (MIT Press, Cambridge, MA, 2005), pp. 1481–1488.
11. The MNIST data set is available at http://yann.lecun.com/.
12. The Olivetti face data set is available at www.cs.toronto.edu/~roweis/data.html.
13. The Reuter Corpus Volume 2 is available at http://trec.nist.gov/data/reuters/reuters.html.
14. S. C. Deerwester, S. T. Dumais, T. K. Landauer, G. W. Furnas, R. A. Harshman, J. Am. Soc. Inf. Sci. 41, 391 (1990).
15. S. T. Roweis, L. K. Saul, Science 290, 2323 (2000).
16. J. B. Tenenbaum, V. de Silva, J. C. Langford, Science 290, 2319 (2000).
17. We thank D. Rumelhart, M. Welling, S. Osindero, and S. Roweis for helpful discussions, and the Natural Sciences and Engineering Research Council of Canada for funding. G.E.H. is a fellow of the Canadian Institute for Advanced Research.

Supporting Online Material
www.sciencemag.org/cgi/content/full/313/5786/504/DC1
Materials and Methods
Figs. S1 to S5
Matlab Code

20 March 2006; accepted 1 June 2006