Sparse deep belief net model for visual area V2

Honglak Lee   Chaitanya Ekanadham   Andrew Y. Ng
Computer Science Department, Stanford University, Stanford, CA 94305
{hllee,chaitu,ang}@cs.stanford.edu

Abstract
Motivated in part by the hierarchical organization of the cortex, a number of algorithms have recently been proposed that try to learn hierarchical …


In related work, several studies have compared models such as these, as well as non-hierarchical/non-deep learning algorithms, to the response properties of neurons in area V1. A study by van Hateren and van der Schaaf [8] showed that the filters learned by independent components analysis (ICA) [9] on natural image data match very well with the classical receptive fields of V1 simple cells. (Filters learned by sparse coding [10, 11] also similarly give responses similar to V1 simple cells.) Our work takes inspiration from the work of van Hateren and van der Schaaf, and represents a study that is done in a similar spirit, only extending the comparisons to a deeper area in the cortical hierarchy, namely visual area V2.

2 Biological comparison

2.1 Features in early visual cortex: area V1

The selectivity of neurons for oriented bar stimuli in cortical area V1 has been well documented [12, 13]. The receptive fields of simple cells in V1 are localized, oriented, bandpass filters that resemble Gabor filters. Several authors have proposed models that have been either formally or informally shown to replicate the Gabor-like properties of V1 simple cells. Many of these algorithms, such as [10, 9, 8, 6], compute a (approximately or exactly) sparse representation of the natural stimuli data. These results are consistent with the "efficient coding hypothesis," which posits that the goal of early visual processing is to encode visual information as efficiently as possible [14]. Some hierarchical extensions of these models [15, 6, 16] are able to learn features that are more complex than simple oriented bars. For example, hierarchical sparse models of natural images have accounted for complex cell receptive fields [17], topography [18, 6], and colinearity and contour coding [19]. Other models, such as [20], have also been shown to give V1 complex cell-like properties.

2.2 Features in visual cortex area V2

It remains unknown to what extent the previously described algorithms can learn higher order features that are known to be encoded further down the ventral visual pathway. In addition, the response properties of neurons in cortical areas receiving projections from area V1 (e.g., area V2) are not nearly as well documented. It is uncertain what type of stimuli cause V2 neurons to respond optimally [21]. One V2 study by [22] reported that the receptive fields in this area were similar to those in the neighboring areas V1 and V4. The authors interpreted their findings as suggestive that area V2 may serve as a place where different channels of visual information are integrated. However, quantitative accounts of responses in area V2 are few in number. In the literature, we identified two sets of quantitative data that give us a good starting point for making measurements to determine whether our algorithms may be computing similar functions as area V2.

In one of these studies, Ito and Komatsu [7] investigated how V2 neurons responded to angular stimuli. They summarized each neuron's response with a two-dimensional visualization of the stimuli set called an angle profile. By making several axial measurements within the profile, the authors were able to compute various statistics about each neuron's selectivity for angle width, angle orientation, and for each separate line component of the angle (see Figure 1). Approximately 80% of the neurons responded to specific angle stimuli. They found neurons that were selective for only one line component of its peak angle, as well as neurons selective for both line components. These neurons yielded angle profiles resembling those of Cell 2 and Cell 5 in Figure 1, respectively. In addition, several neurons exhibited a high amount of selectivity for its peak angle, producing angle profiles like that of Cell 1 in Figure 1. No neurons were found that had more elongation in a diagonal axis than in the horizontal or vertical axes, indicating that neurons in V2 were not selective for angle width or orientation. Therefore, an important conclusion made from [7] was that a V2 neuron's response to an angle stimulus is highly dependent on its responses to each individual line component of the angle. While the dependence was often observed to be simply additive, as was the case with neurons yielding profiles like those of Cells 1 and 2 in Figure 1 (right), this was not always the case. 29 neurons had very small peak response areas and yielded profiles like that of Cell 1 in Figure 1 (right), thus indicating a highly specific tuning to an angle stimulus. While the former responses suggest a simple linear computation of V1 neural responses, the latter responses suggest a nonlinear computation [21]. The analysis methods adopted in [7] are very useful in characterizing the response properties, and we use these methods to evaluate our own model.

Another study by Hegde and Van Essen [23] studied the responses of a population of V2 neurons to complex contour and grating stimuli. They found several V2 neurons responding maximally for angles, and the distribution of peak angles for these neurons is consistent with that found by [7]. In addition, several V2 neurons responded maximally for shapes such as intersections, tri-stars, five-point stars, circles, and arcs of varying length.

Here, N(·) is the Gaussian density, and logistic(·) is the logistic function. For training the parameters of the model, the objective is to maximize the log-likelihood of the data. We also want hidden unit activations to be sparse; thus, we add a regularization term that penalizes a deviation of the expected activation of the hidden units from a (low) fixed level p.² Thus, given a training set {v^(1), ..., v^(m)} comprising m examples, we pose the following optimization problem:

    \min_{\{w_{ij}, c_i, b_j\}} \; -\sum_{l=1}^{m} \log \sum_{h} P(v^{(l)}, h^{(l)}) \; + \; \lambda \sum_{j=1}^{n} \left| p - \frac{1}{m} \sum_{l=1}^{m} E\left[ h_j^{(l)} \,\middle|\, v^{(l)} \right] \right|^2,    (4)

where E[·] is the conditional expectation given the data, \lambda is a regularization constant, and p is a constant controlling the sparseness of the hidden units h_j. Thus, our objective is the sum of a log-likelihood term and a regularization term. In principle, we can apply gradient descent to this problem; however, computing the gradient of the log-likelihood term is expensive. Fortunately, the contrastive divergence learning algorithm gives an efficient approximation to the gradient of the log-likelihood [25]. Building upon this, on each iteration we can apply the contrastive divergence update rule, followed by one step of gradient descent using the gradient of the regularization term.³ The details of our procedure are summarized in Algorithm 1.

Algorithm 1  Sparse RBM learning algorithm
1. Update the parameters using the contrastive divergence learning rule. More specifically,
       w_{ij} := w_{ij} + \alpha ( \langle v_i h_j \rangle_{data} - \langle v_i h_j \rangle_{recon} )
       c_i := c_i + \alpha ( \langle v_i \rangle_{data} - \langle v_i \rangle_{recon} )
       b_j := b_j + \alpha ( \langle h_j \rangle_{data} - \langle h_j \rangle_{recon} ),
   where \alpha is a learning rate, and \langle \cdot \rangle_{recon} is an expectation over the reconstruction data, estimated using one iteration of Gibbs sampling (as in Equations 2, 3).
2. Update the parameters using the gradient of the regularization term.
3. Repeat Steps 1 and 2 until convergence.
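The two-step procedure of Algorithm 1 can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's code: the learning rate, regularization weight, and target sparsity values are placeholders, and the sparsity gradient is applied only to the hidden biases (as footnote 3 describes), up to a constant factor.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def sparse_rbm_step(V, W, b, c, alpha=0.01, lam=0.1, p=0.02,
                    rng=np.random.default_rng(0)):
    """One pass of Algorithm 1 on a mini-batch V (rows are examples):
    a CD-1 parameter update, then a gradient step on the sparsity
    penalty, applied to the hidden biases only."""
    m = V.shape[0]
    # Positive phase: expected hidden activations given the data.
    H = logistic(V @ W + b)                        # <h_j>_data
    # One Gibbs iteration: sample h, reconstruct v, recompute h.
    Hs = (rng.random(H.shape) < H).astype(float)
    Vr = logistic(Hs @ W.T + c)                    # reconstruction
    Hr = logistic(Vr @ W + b)                      # <h_j>_recon
    # Step 1: contrastive divergence updates.
    W += alpha * (V.T @ H - Vr.T @ Hr) / m
    c += alpha * (V.mean(axis=0) - Vr.mean(axis=0))
    b += alpha * (H.mean(axis=0) - Hr.mean(axis=0))
    # Step 2: push the mean hidden activation toward the target p
    # (gradient of the penalty in Equation 4, up to a constant factor).
    b += alpha * lam * (p - H.mean(axis=0))
    return W, b, c
```

Iterating this step over mini-batches, with the bias nudged toward the target activation p each time, is what keeps the learned hidden units sparse.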
3.2 Learning deep networks using sparse RBM

Once a layer of the network is trained, the parameters w_ij, b_j, c_i's are frozen and the hidden unit values given the data are inferred. These inferred values serve as the "data" used to train the next higher layer in the network. Hinton et al. [1] showed that by repeatedly applying such a procedure, one can learn a multilayered deep belief network. In some cases, this iterative "greedy" algorithm can further be shown to be optimizing a variational bound on the data likelihood, if each layer has at least as many units as the layer below (although in practice this is not necessary to arrive at a desirable solution; see [1] for a detailed discussion). In our experiments using natural images, we learn a network with two hidden layers, with each layer learned using the sparse RBM algorithm described in Section 3.1.

4 Visualization

4.1 Learning "strokes" from handwritten digits

Figure 2: Bases learned from MNIST data

We applied the sparse RBM algorithm to the MNIST handwritten digit dataset.⁴ We learned a sparse RBM with 69 visible units and 200 hidden units. The learned bases are shown in Figure 2. (Each basis corresponds to one column of the weight matrix W, left-multiplied by the unwhitening matrix.) Many bases found by the algorithm roughly represent different "strokes" of which handwritten digits are comprised. This is consistent
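The greedy layer-wise procedure of Section 3.2 can be sketched as follows. This is an illustrative simplification, assuming binary units throughout: the inner trainer uses plain CD-1 (the sparsity term of Algorithm 1 is omitted here for brevity), and all hyperparameter values are placeholders.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(V, n_hidden, epochs=3, alpha=0.05, rng=None):
    """Minimal full-batch CD-1 trainer for one RBM layer."""
    rng = rng or np.random.default_rng(0)
    n_vis, m = V.shape[1], V.shape[0]
    W = 0.01 * rng.standard_normal((n_vis, n_hidden))
    b, c = np.zeros(n_hidden), np.zeros(n_vis)
    for _ in range(epochs):
        H = logistic(V @ W + b)
        Hs = (rng.random(H.shape) < H).astype(float)
        Vr = logistic(Hs @ W.T + c)
        Hr = logistic(Vr @ W + b)
        W += alpha * (V.T @ H - Vr.T @ Hr) / m
        c += alpha * (V.mean(axis=0) - Vr.mean(axis=0))
        b += alpha * (H.mean(axis=0) - Hr.mean(axis=0))
    return W, b, c

def train_deep(V, layer_sizes):
    """Greedy layer-wise training: freeze each trained layer and use
    its inferred hidden-unit values as the 'data' for the next layer."""
    layers, X = [], V
    for n_h in layer_sizes:
        W, b, c = train_rbm(X, n_h)
        layers.append((W, b, c))
        X = logistic(X @ W + b)   # inferred values from the frozen layer
    return layers
```

For the two-hidden-layer network used in the paper's natural-image experiments, the call would be of the form `train_deep(patches, [400, 200])`.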
² Less formally, this regularization ensures that the "firing rate" of the model neurons (corresponding to the latent random variables h_j) are kept at a certain (fairly low) level, so that the activations of the model neurons are sparse. Similar intuition was also used in other models (e.g., see Olshausen and Field [10]).

³ To increase computational efficiency, we made one additional change. Note that the regularization term is defined using a sum over the entire training set; if we use stochastic gradient descent or mini-batches (small subsets of the training data) to estimate this term, it results in biased estimates of the gradient. To ameliorate this, we used mini-batches, but in the gradient step that tries to minimize the regularization term, we update only the bias terms b_j's (which directly control the degree to which the hidden units are activated, and thus their sparsity), instead of updating all the parameters b_j and w_ij's.

⁴ Downloaded from http://yann.lecun.com/exdb/mnist/. Each pixel was normalized to the unit interval, and we used PCA whitening to reduce the dimension to 69 principal components for computational efficiency. (Similar results were obtained without whitening.)

Figure 3: 400 first layer bases learned from the van Hateren natural image dataset, using our algorithm.
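Footnote 4 mentions normalizing pixels and PCA-whitening MNIST down to 69 principal components before training. A minimal SVD-based sketch of that preprocessing (the function name and recipe are ours; the paper does not give its exact implementation):

```python
import numpy as np

def pca_whiten(X, k):
    """Project the data onto its top-k principal components and
    rescale each component to unit variance. The inverse mapping
    plays the role of the 'unwhitening matrix' used when
    visualizing learned bases in pixel space."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data: Xc = U diag(S) Vt.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scale = S[:k] / np.sqrt(X.shape[0])   # per-component std. dev.
    Z = (Xc @ Vt[:k].T) / scale           # whitened data, n x k
    unwhiten = Vt[:k].T * scale           # maps whitened coords back
    return Z, unwhiten, mean
```

With `k = 69` on flattened MNIST images, each whitened coordinate has unit variance, and `Z @ unwhiten.T + mean` approximately reconstructs the original pixels.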
Figure 4: Visualization of 200 second layer bases (model V2 receptive fields), learned from natural images. Each small group of 3-5 images (arranged in a row) shows one model V2 unit; the leftmost patch in the group is a visualization of the model V2 basis, and is obtained by taking a weighted linear combination of the first layer "V1" bases to which it is connected. The next few patches in the group show the first layer bases that have the strongest weight connection to the model V2 basis.

with results obtained by applying different algorithms to learn sparse representations of this dataset (e.g., [2, 5]).

4.2 Learning from natural images

We also applied the algorithm to a training set of 14-by-14 natural image patches, taken from a dataset compiled by van Hateren.⁵ We learned a sparse RBM model with 196 visible units and 400 hidden units. The learned bases are shown in Figure 3; they are oriented, Gabor-like bases and resemble the receptive fields of V1 simple cells.⁶

4.3 Learning a two-layer model of natural images using sparse RBMs

We further learned a two-layer network by stacking one sparse RBM on top of another (see Section 3.2 for details).⁷ After learning, the second layer weights were quite sparse: most of the weights were very small, and only a few were either highly positive or highly negative. Positive
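The Figure 4 caption describes visualizing each model V2 basis as a weighted linear combination of the first-layer "V1" bases it connects to. One plausible reading of that recipe (function and parameter names are ours, and the number of top connections kept is an assumption, since the caption only says "the first layer bases that have the strongest weight connection"):

```python
import numpy as np

def visualize_v2_basis(W1, W2, j, top_k=4):
    """Render second-layer basis j in pixel space by combining the
    first-layer bases with the strongest weights to it.
    W1: (n_pixels, n_v1) first-layer bases as columns;
    W2: (n_v1, n_v2) second-layer weights."""
    w = W2[:, j]
    top = np.argsort(-np.abs(w))[:top_k]   # strongest V1 connections
    patch = W1[:, top] @ w[top]            # weighted linear combination
    return patch, top
```

Reshaping `patch` to the 14-by-14 patch size would give the leftmost image of a Figure 4 row; the indices in `top` identify the V1 bases shown next to it.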
⁵ The images were obtained from http://hlab.phys.rug.nl/imlib/index.html. We used 100,000 14-by-14 image patches randomly sampled from an ensemble of 2000 images; each subset of 200 patches was used as a mini-batch.

⁶ Most other authors' experiments to date using regular (non-sparse) RBMs, when trained on such data, seem to have learned relatively diffuse, unlocalized bases (ones that do not represent oriented edge filters). While sensitive to the parameter settings and requiring a long training time, we found that it is possible in some cases to get a regular RBM to learn oriented edge filter bases as well. But in our experiments, even in these cases we found that repeating this process to build a two layer deep belief net (see Section 4.3) did not encode a significant number of corners/angles, unlike one trained using the sparse RBM; therefore, it showed a significantly worse match to the Ito & Komatsu statistics. For example, the fraction of model V2 neurons that respond strongly to a pair of edges near right angles (formally, have peak angle in the range 60-120 degrees) was 2% for the regular RBM, whereas it was 17% for the sparse RBM (and Ito & Komatsu reported 22%). See Section 5.1 for more details.

⁷ For the results reported in this paper, we trained the second layer sparse RBM with real-valued visible units; however, the results were very similar when we trained the second layer sparse RBM with binary-valued visible units (except that the second layer weights became less sparse).

Figure 6: Images show distributions over stimulus response statistics (averaged over 10 trials) from our algorithm (blue) and in data taken from [7] (green). The five figures show respectively (i) the distribution over peak angle response (ranging from 0 to 180 degrees; each bin represents a range of 30 degrees), (ii) distribution over tolerance to primary line component (Figure 1C, in dominant vertical or horizontal direction), (iii) distribution over tolerance to secondary line component (Figure 1C, in non-dominant direction), (iv) tolerance to angle width (Figure 1D), (v) tolerance to angle orientation (Figure 1E). See Figure 1 caption, and [7], for details.
Figure 7: Visualization of a number of model V2 neurons that maximally respond to various complex stimuli. Each row of seven images represents one V2 basis. In each row, the leftmost image shows a linear combination of the top three weighted V1 components that comprise the V2 basis; the next three images show the top three optimal stimuli; and the last three images show the top three weighted V1 bases. The V2 bases shown in the figures maximally respond to acute angles (left), obtuse angles (middle), and tri-stars and junctions (right).

5.2 Complex-shaped model V2 neurons

Our second experiment represents a comparison to a subset of the results described in Hegde and van Essen [23]. We generated a stimulus set comprising some of [23]'s complex-shaped stimuli: angles, single bars, tri-stars (three line segments that meet at a point), and arcs/circles, and measured the response of the second layer of our sparse RBM model to these stimuli.¹¹ We observe that many V2 bases are activated mainly by one of these different stimulus classes. For example, some model V2 neurons activate maximally to single bars; some maximally activate to (acute or obtuse) angles; and others to tri-stars (see Figure 7). Further, the number of V2 bases that are maximally activated by acute angles is significantly larger than the number maximally activated by obtuse angles, and the number of V2 bases that respond maximally to the tri-stars was much smaller than both preceding cases. This is also consistent with the results described in [23].

6 Conclusions

We presented a sparse variant of the deep belief network model. When trained on natural images, this model learns local, oriented, edge filters in the first layer. More interestingly, the second layer captures a variety of both colinear ("contour") features as well as corners and junctions that, in a quantitative comparison to measurements of V2 taken by Ito & Komatsu, appeared to give responses that were similar along several dimensions. This by no means indicates that the cortex is a sparse RBM, but perhaps is more suggestive of contours, corners and junctions being fundamental to the statistics of natural images.¹² Nonetheless, we believe that these results also suggest that sparse
¹¹ All the stimuli were 14-by-14 pixel image patches. We applied the protocol described in Section 5.1 to the stimulus data, to compute the model V1 and V2 responses.

¹² In preliminary experiments, we also found that when these ideas are applied to self-taught learning [26] (in which one may use unlabeled data to identify features that are then useful for some supervised learning task), using a two-layer sparse RBM usually results in significantly better features for object recognition than using only a one-layer network.
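Footnote 11 says the model V1 and V2 responses to the stimulus patches were computed with the protocol of Section 5.1, which is not reproduced in this excerpt. A simplified feed-forward reading of such a response computation (our own simplification, not the paper's exact protocol; all names are ours):

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def model_responses(stimuli, W1, b1, W2, b2):
    """Propagate each flattened stimulus patch through the two-layer
    network; H1 and H2 are the model 'V1' and 'V2' unit activations."""
    H1 = logistic(stimuli @ W1 + b1)   # model V1 responses
    H2 = logistic(H1 @ W2 + b2)        # model V2 responses
    return H1, H2

def preferred_stimuli(H2):
    """Index of the maximally activating stimulus for each V2 unit,
    e.g. to sort units into the angle/bar/tri-star classes of Sec. 5.2."""
    return H2.argmax(axis=0)
```

Grouping the stimuli by class and comparing each unit's preferred stimulus across classes is one way to reproduce the kind of tally reported in Section 5.2.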