Sparse deep belief net model for visual area V2

Honglak Lee, Chaitanya Ekanadham, Andrew Y. Ng
Computer Science Department, Stanford University, Stanford, CA 94305
{hllee,chaitu,ang}@cs.stanford.edu

Abstract

Motivated in part by the hierarchical organization of the cortex, a number of algorithms have recently been proposed that try to learn hierarchical ...


In related work, several studies have compared models such as these, as well as non-hierarchical/non-deep learning algorithms, to the response properties of neurons in area V1. A study by van Hateren and van der Schaaf [8] showed that the filters learned by independent components analysis (ICA) [9] on natural image data match very well with the classical receptive fields of V1 simple cells. (Filters learned by sparse coding [10, 11] similarly give responses like those of V1 simple cells.) Our work takes inspiration from the work of van Hateren and van der Schaaf, and represents a study done in a similar spirit, only extending the comparisons to a deeper area in the cortical hierarchy, namely visual area V2.

2 Biological comparison

2.1 Features in early visual cortex: area V1

The selectivity of neurons for oriented bar stimuli in cortical area V1 has been well documented [12, 13]. The receptive fields of simple cells in V1 are localized, oriented, bandpass filters that resemble Gabor filters. Several authors have proposed models that have been either formally or informally shown to replicate the Gabor-like properties of V1 simple cells. Many of these algorithms, such as [10, 9, 8, 6], compute an (approximately or exactly) sparse representation of the natural stimuli data. These results are consistent with the "efficient coding hypothesis," which posits that the goal of early visual processing is to encode visual information as efficiently as possible [14]. Some hierarchical extensions of these models [15, 6, 16] are able to learn features that are more complex than simple oriented bars. For example, hierarchical sparse models of natural images have accounted for complex cell receptive fields [17], topography [18, 6], and colinearity and contour coding [19]. Other models, such as [20], have also been shown to give V1 complex cell-like properties.

2.2 Features in visual cortex area V2

It remains unknown to what extent the previously described algorithms can learn higher-order features that are known to be encoded further down the ventral visual pathway. In addition, the response properties of neurons in cortical areas receiving projections from area V1 (e.g., area V2) are not nearly as well documented, and it is uncertain what type of stimuli cause V2 neurons to respond optimally [21]. One V2 study by [22] reported that the receptive fields in this area were similar to those in the neighboring areas V1 and V4. The authors interpreted their findings as suggesting that area V2 may serve as a place where different channels of visual information are integrated. However, quantitative accounts of responses in area V2 are few in number.

In the literature, we identified two sets of quantitative data that give us a good starting point for making measurements to determine whether our algorithms may be computing similar functions as area V2. In one of these studies, Ito and Komatsu [7] investigated how V2 neurons responded to angular stimuli. They summarized each neuron's response with a two-dimensional visualization of the stimulus set called an angle profile. By making several axial measurements within the profile, the authors were able to compute various statistics about each neuron's selectivity for angle width, angle orientation, and for each separate line component of the angle (see Figure 1). Approximately 80% of the neurons responded to specific angle stimuli. They found neurons that were selective for only one line component of their peak angle, as well as neurons selective for both line components. These neurons yielded angle profiles resembling those of Cell 2 and Cell 5 in Figure 1, respectively. In addition, several neurons exhibited a high degree of selectivity for their peak angle, producing angle profiles like that of Cell 1 in Figure 1. No neurons were found that had more elongation in a diagonal axis than in the horizontal or vertical axes, indicating that neurons in V2 were not selective for angle width or orientation. Therefore, an important conclusion made from [7] was that a V2 neuron's response to an angle stimulus is highly dependent on its responses to each individual line component of the angle. While the dependence was often observed to be simply additive, as was the case with neurons yielding profiles like those of Cells 1 and 2 in Figure 1 (right), this was not always the case: 29 neurons had very small peak response areas and yielded profiles like that of Cell 1 in Figure 1 (right), thus indicating a highly specific tuning to an angle stimulus. While the former responses suggest a simple linear computation of V1 neural responses, the latter responses suggest a nonlinear computation [21]. The analysis methods adopted in [7] are very useful in characterizing the response properties, and we use these methods to evaluate our own model.

Another study, by Hegde and Van Essen [23], examined the responses of a population of V2 neurons to complex contour and grating stimuli. They found several V2 neurons responding maximally for angles, and the distribution of peak angles for these neurons is consistent with that found by [7]. In addition, several V2 neurons responded maximally for shapes such as intersections, tri-stars, five-point stars, circles, and arcs of varying length.

3 Algorithm

3.1 Sparse restricted Boltzmann machines

...

Here, N(·) is the Gaussian density, and logistic(·) is the logistic function. For training the parameters of the model, the objective is to maximize the log-likelihood of the data. We also want the hidden unit activations to be sparse; thus, we add a regularization term that penalizes a deviation of the expected activation of the hidden units from a (low) fixed level p.² Thus, given a training set {v⁽¹⁾, ..., v⁽ᵐ⁾} comprising m examples, we pose the following optimization problem:

\[
\underset{\{w_{ij},\, c_i,\, b_j\}}{\text{minimize}}\;\;
-\sum_{l=1}^{m} \log \sum_{\mathbf{h}} P\!\left(\mathbf{v}^{(l)}, \mathbf{h}^{(l)}\right)
\;+\; \lambda \sum_{j=1}^{n} \left| p - \frac{1}{m}\sum_{l=1}^{m} \mathbb{E}\!\left[ h^{(l)}_{j} \,\middle|\, \mathbf{v}^{(l)} \right] \right|^{2},
\tag{4}
\]

where E[·] is the conditional expectation given the data, λ is a regularization constant, and p is a constant controlling the sparseness of the hidden units h_j. Thus, our objective is the sum of a log-likelihood term and a regularization term. In principle, we can apply gradient descent to this problem; however, computing the gradient of the log-likelihood term is expensive. Fortunately, the contrastive divergence learning algorithm gives an efficient approximation to the gradient of the log-likelihood [25]. Building upon this, on each iteration we can apply the contrastive divergence update rule, followed by one step of gradient descent using the gradient of the regularization term.³ The details of our procedure are summarized in Algorithm 1.

Algorithm 1  Sparse RBM learning algorithm

1. Update the parameters using the contrastive divergence learning rule. More specifically,

\[
\begin{aligned}
w_{ij} &:= w_{ij} + \alpha\left(\langle v_i h_j\rangle_{\text{data}} - \langle v_i h_j\rangle_{\text{recon}}\right)\\
c_{i} &:= c_{i} + \alpha\left(\langle v_i\rangle_{\text{data}} - \langle v_i\rangle_{\text{recon}}\right)\\
b_{j} &:= b_{j} + \alpha\left(\langle h_j\rangle_{\text{data}} - \langle h_j\rangle_{\text{recon}}\right),
\end{aligned}
\]

where α is a learning rate, and ⟨·⟩_recon is an expectation over the reconstruction data, estimated using one iteration of Gibbs sampling (as in Equations 2, 3).

2. Update the parameters using the gradient of the regularization term.

3. Repeat Steps 1 and 2 until convergence.
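For concreteness, the following NumPy sketch implements one iteration of Algorithm 1 for a Gaussian-visible, binary-hidden RBM with unit noise variance. The function name and all hyperparameter values are illustrative assumptions, not settings reported in the paper; following footnote 3, the sparsity gradient step updates only the hidden biases b_j.

```python
import numpy as np

def logistic(x):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-x))

def sparse_rbm_step(V, W, b, c, alpha=0.02, lam=3.0, p=0.02):
    """One iteration of Algorithm 1 on a mini-batch V (examples x visible units).
    W: visible x hidden weights; b: hidden biases; c: visible biases.
    alpha, lam, p are illustrative guesses, not values from the paper."""
    m = V.shape[0]

    # Mean-field hidden activations given the data: E[h_j = 1 | v]
    H = logistic(V @ W + b)
    # One Gibbs iteration: sample h, reconstruct v (its conditional mean),
    # then re-infer the hidden activations from the reconstruction
    Hs = (np.random.rand(*H.shape) < H).astype(float)
    Vr = Hs @ W.T + c
    Hr = logistic(Vr @ W + b)

    # Step 1: contrastive divergence updates
    W += alpha * (V.T @ H - Vr.T @ Hr) / m
    c += alpha * (V.mean(axis=0) - Vr.mean(axis=0))
    b += alpha * (H.mean(axis=0) - Hr.mean(axis=0))

    # Step 2: gradient step on the sparsity penalty
    #   lam * sum_j (p - (1/m) sum_l E[h_j | v^(l)])^2,
    # applied only to the hidden biases b, as in footnote 3
    q = H.mean(axis=0)                    # mean activation per hidden unit
    dq_db = (H * (1.0 - H)).mean(axis=0)  # d q_j / d b_j for the logistic
    b += alpha * lam * 2.0 * (p - q) * dq_db
    return W, b, c
```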
3.2 Learning deep networks using sparse RBM

Once a layer of the network is trained, the parameters w_ij, b_j, c_i are frozen and the hidden unit values given the data are inferred. These inferred values serve as the "data" used to train the next higher layer in the network. Hinton et al. [1] showed that by repeatedly applying such a procedure, one can learn a multilayered deep belief network. In some cases, this iterative "greedy" algorithm can further be shown to be optimizing a variational bound on the data likelihood, if each layer has at least as many units as the layer below (although in practice this is not necessary to arrive at a desirable solution; see [1] for a detailed discussion). In our experiments using natural images, we learn a network with two hidden layers, with each layer learned using the sparse RBM algorithm described in Section 3.1.

4 Visualization

4.1 Learning "strokes" from handwritten digits

Figure 2: Bases learned from MNIST data

We applied the sparse RBM algorithm to the MNIST handwritten digit dataset.⁴ We learned a sparse RBM with 69 visible units and 200 hidden units. The learned bases are shown in Figure 2. (Each basis corresponds to one column of the weight matrix W, left-multiplied by the unwhitening matrix.) Many bases found by the algorithm roughly represent different "strokes" of which handwritten digits are comprised. This is consistent with results obtained by applying different algorithms to learn sparse representations of this dataset (e.g., [2, 5]).

² Less formally, this regularization ensures that the "firing rate" of the model neurons (corresponding to the latent random variables h_j) is kept at a certain (fairly low) level, so that the activations of the model neurons are sparse. Similar intuition was also used in other models (e.g., see Olshausen and Field [10]).

³ To increase computational efficiency, we made one additional change. Note that the regularization term is defined using a sum over the entire training set; if we use stochastic gradient descent or mini-batches (small subsets of the training data) to estimate this term, it results in biased estimates of the gradient. To ameliorate this, we used mini-batches, but in the gradient step that tries to minimize the regularization term, we update only the bias terms b_j (which directly control the degree to which the hidden units are activated, and thus their sparsity), instead of updating all the parameters b_j and w_ij.

⁴ Downloaded from http://yann.lecun.com/exdb/mnist/. Each pixel was normalized to the unit interval, and we used PCA whitening to reduce the dimension to 69 principal components for computational efficiency. (Similar results were obtained without whitening.)

Figure 3: 400 first layer bases learned from the van Hateren natural image dataset, using our algorithm.
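As a sketch of the greedy layer-wise procedure of Section 3.2 in the natural image setup of Sections 4.2-4.3, the driver below trains a first layer, freezes it, infers E[h | v], and trains a second layer on those activations. The wrapper train_sparse_rbm is hypothetical (built on sparse_rbm_step above), random data stands in for the whitened patches, and the epoch count and initialization scale are guesses; only the patch count, patch size, mini-batch size, and layer sizes follow the paper.

```python
import numpy as np

def train_sparse_rbm(data, n_hidden, n_epochs=10, batch_size=200):
    """Train one sparse RBM layer by repeated calls to sparse_rbm_step
    (sketched above). Hypothetical driver with illustrative settings."""
    n_visible = data.shape[1]
    W = 0.01 * np.random.randn(n_visible, n_hidden)
    b = np.zeros(n_hidden)
    c = np.zeros(n_visible)
    for _ in range(n_epochs):
        order = np.random.permutation(len(data))
        for i in range(0, len(data), batch_size):
            batch = data[order[i:i + batch_size]]
            W, b, c = sparse_rbm_step(batch, W, b, c)
    return W, b, c

# Greedy layer-wise training: train layer 1, freeze it, and use its
# inferred activations as the "data" for layer 2 (Section 3.2).
patches = np.random.randn(100000, 196)            # stand-in for whitened 14x14 patches
W1, b1, c1 = train_sparse_rbm(patches, n_hidden=400)
H1 = 1.0 / (1.0 + np.exp(-(patches @ W1 + b1)))   # frozen layer-1 inference E[h | v]
W2, b2, c2 = train_sparse_rbm(H1, n_hidden=200)   # real-valued visible units (footnote 7)
```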
Figure 4: Visualization of 200 second layer bases (model V2 receptive fields), learned from natural images. Each small group of 3-5 images (arranged in a row) shows one model V2 unit; the leftmost patch in the group is a visualization of the model V2 basis, obtained by taking a weighted linear combination of the first layer "V1" bases to which it is connected. The next few patches in the group show the first layer bases that have the strongest weight connection to the model V2 basis.

4.2 Learning from natural images

We also applied the algorithm to a training set of 14-by-14 natural image patches, taken from a dataset compiled by van Hateren.⁵ We learned a sparse RBM model with 196 visible units and 400 hidden units. The learned bases are shown in Figure 3; they are oriented, Gabor-like bases and resemble the receptive fields of V1 simple cells.⁶

4.3 Learning a two-layer model of natural images using sparse RBMs

We further learned a two-layer network by stacking one sparse RBM on top of another (see Section 3.2 for details).⁷ After learning, the second layer weights were quite sparse; most of the weights were very small, and only a few were either highly positive or highly negative. Positive ...

⁵ The images were obtained from http://hlab.phys.rug.nl/imlib/index.html. We used 100,000 14-by-14 image patches randomly sampled from an ensemble of 2000 images; each subset of 200 patches was used as a mini-batch.

⁶ Most other authors' experiments to date using regular (non-sparse) RBMs, when trained on such data, seem to have learned relatively diffuse, unlocalized bases (ones that do not represent oriented edge filters). While sensitive to the parameter settings and requiring a long training time, we found that it is possible in some cases to get a regular RBM to learn oriented edge filter bases as well. But in our experiments, even in these cases we found that repeating this process to build a two-layer deep belief net (see Section 4.3) did not encode a significant number of corners/angles, unlike one trained using the sparse RBM; therefore, it showed a significantly worse match to the Ito & Komatsu statistics. For example, the fraction of model V2 neurons that respond strongly to a pair of edges near right angles (formally, that have peak angle in the range 60-120 degrees) was 2% for the regular RBM, whereas it was 17% for the sparse RBM (and Ito & Komatsu reported 22%). See Section 5.1 for more details.

⁷ For the results reported in this paper, we trained the second layer sparse RBM with real-valued visible units; however, the results were very similar when we trained the second layer sparse RBM with binary-valued visible units (except that the second layer weights became less sparse).

Figure 6: Images show distributions over stimulus response statistics (averaged over 10 trials) from our algorithm (blue) and from data taken from [7] (green). The five figures show, respectively, (i) the distribution over peak angle response (ranging from 0 to 180 degrees; each bin represents a range of 30 degrees), (ii) the distribution over tolerance to the primary line component (Figure 1C, in the dominant vertical or horizontal direction), (iii) the distribution over tolerance to the secondary line component (Figure 1C, in the non-dominant direction), (iv) tolerance to angle width (Figure 1D), and (v) tolerance to angle orientation (Figure 1E). See the Figure 1 caption, and [7], for details.
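A Figure 4-style panel can be rendered, under assumptions, by projecting each second-layer unit through the first-layer bases. In the matplotlib sketch below, show_v2_basis is a hypothetical helper; it omits the unwhitening step the paper applies, so the panels stay in whitened pixel space.

```python
import numpy as np
import matplotlib.pyplot as plt

def show_v2_basis(W1, W2, j, top_k=4, patch_shape=(14, 14)):
    """Render one model V2 unit: the leftmost panel is the weighted linear
    combination of first-layer ("V1") bases, followed by the top_k most
    strongly connected V1 bases. Hypothetical helper; W1 columns are
    assumed to already be in pixel space (unwhitening omitted)."""
    w = W2[:, j]                                  # layer-2 weights into unit j
    top = np.argsort(-np.abs(w))[:top_k]          # strongest V1 connections
    panels = [W1 @ w] + [np.sign(w[i]) * W1[:, i] for i in top]
    fig, axes = plt.subplots(1, len(panels), figsize=(2 * len(panels), 2))
    for ax, p in zip(axes, panels):
        ax.imshow(p.reshape(patch_shape), cmap="gray")
        ax.axis("off")
    plt.show()

# e.g., show_v2_basis(W1, W2, j=0) after the two-layer training above
```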
Figure 7: Visualization of a number of model V2 neurons that maximally respond to various complex stimuli. Each row of seven images represents one V2 basis. In each row, the leftmost image shows a linear combination of the top three weighted V1 components that comprise the V2 basis; the next three images show the top three optimal stimuli; and the last three images show the top three weighted V1 bases. The V2 bases shown in the figures maximally respond to acute angles (left), obtuse angles (middle), and tri-stars and junctions (right).

5.2 Complex shaped model V2 neurons

Our second experiment represents a comparison to a subset of the results described in Hegde and Van Essen [23]. We generated a stimulus set comprising some of [23]'s complex shaped stimuli: angles, single bars, tri-stars (three line segments that meet at a point), and arcs/circles, and measured the response of the second layer of our sparse RBM model to these stimuli.¹¹ We observe that many V2 bases are activated mainly by one of these different stimulus classes. For example, some model V2 neurons activate maximally to single bars; some activate maximally to (acute or obtuse) angles; and others to tri-stars (see Figure 7). Further, the number of V2 bases that are maximally activated by acute angles is significantly larger than the number maximally activated by obtuse angles, and the number of V2 bases that respond maximally to tri-stars was much smaller than both preceding cases. This is also consistent with the results described in [23].

6 Conclusions

We presented a sparse variant of the deep belief network model. When trained on natural images, this model learns local, oriented, edge filters in the first layer. More interestingly, the second layer captures a variety of both colinear ("contour") features as well as corners and junctions that, in a quantitative comparison to measurements of V2 taken by Ito & Komatsu, appeared to give responses that were similar along several dimensions. This by no means indicates that the cortex is a sparse RBM, but perhaps is more suggestive of contours, corners and junctions being fundamental to the statistics of natural images.¹² Nonetheless, we believe that these results also suggest that sparse ...

¹¹ All the stimuli were 14-by-14 pixel image patches. We applied the protocol described in Section 5.1 to the stimulus data to compute the model V1 and V2 responses.

¹² In preliminary experiments, we also found that when these ideas are applied to self-taught learning [26] (in which one may use unlabeled data to identify features that are then useful for some supervised learning task), using a two-layer sparse RBM usually results in significantly better features for object recognition than using only a one-layer network.
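For probing a trained model in the spirit of Section 5.2, the sketch below generates crude angle stimuli and computes feed-forward first- and second-layer responses. Both angle_stimulus and model_v2_responses are hypothetical helpers: the rendering is not the authors' exact stimulus protocol, and preprocessing of the stimuli (e.g., whitening to match the training patches) is omitted.

```python
import numpy as np

def angle_stimulus(theta1, theta2, size=14):
    """Draw two line segments meeting at the patch center, at orientations
    theta1 and theta2 (radians). A crude stand-in for the angle stimuli
    of [7]/[23]."""
    img = np.zeros((size, size))
    cy = cx = size // 2
    for theta in (theta1, theta2):
        for r in range(size // 2 + 1):
            y = int(round(cy + r * np.sin(theta)))
            x = int(round(cx + r * np.cos(theta)))
            if 0 <= y < size and 0 <= x < size:
                img[y, x] = 1.0
    return img

def model_v2_responses(stimuli, W1, b1, W2, b2):
    """Feed-forward responses of the two-layer model: mean-field E[h | v]
    at layer 1, then again at layer 2."""
    logistic = lambda x: 1.0 / (1.0 + np.exp(-x))
    V = np.stack([s.ravel() for s in stimuli])    # stimuli as 196-dim vectors
    H1 = logistic(V @ W1 + b1)
    return logistic(H1 @ W2 + b2)

# e.g., responses of every model V2 unit to a sweep of angle widths:
# R = model_v2_responses([angle_stimulus(0.0, a) for a in np.linspace(0.3, 1.2, 10)],
#                        W1, b1, W2, b2)
# preferred = R.argmax(axis=0)   # which angle maximally drives each V2 unit
```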