An information-maximisation approach to blind separation and blind deconvolution


Please send comments to tony@salk.edu. This paper will appear in Neural Computation. The reference for this version is: Technical Report no. INC-9501, February 1995, Institute for Neural Computation, UCSD, San Diego, CA.



An information-maximisation approach to blind separation and blind deconvolution

Anthony J. Bell and Terrence J. Sejnowski
Computational Neurobiology Laboratory, The Salk Institute, N. Torrey Pines Road, La Jolla, California

Abstract. We derive a new self-organising learning algorithm which maximises the information transferred in a network of non-linear units. The algorithm does not assume any knowledge of the input distributions, and is defined here for the zero-noise limit. Under these conditions, information maximisation has extra properties not found in the linear case (Linsker 1989). The non-linearities in the transfer function are able to pick up higher-order moments of the input distributions and perform something akin to true redundancy reduction between units in the output representation. This enables the network to separate statistically independent components in the inputs: a higher-order generalisation of Principal Components Analysis. We apply the network to the source separation (or cocktail party) problem, successfully separating unknown mixtures of up to ten speakers. We also show that a variant on the network architecture is able to perform blind deconvolution (cancellation of unknown echoes and reverberation in a speech signal). Finally, we derive dependencies of information transfer on time delays. We suggest that information maximisation provides a unifying framework for problems in 'blind' signal processing.

1 Introduction

[...] This kind of higher-order redundancy reduction is called Independent Component Analysis (ICA), and it is what enables the network to solve the blind separation task. The paper is organised as follows. Section 2 describes the new information maximisation learning algorithm, applied, respectively, to a single input, an N→N mapping, a causal filter, a system with time delays, and a flexible non-linearity. Section 3 describes the blind separation and blind deconvolution problems. Section 4 discusses the conditions under which the information maximisation process can find factorial codes (perform ICA), and therefore solve the separation and deconvolution problems. Section 5 presents results on the separation and deconvolution of speech signals. Section 6 attempts to place the theory and results in the context of previous work, and mentions the limitations of the approach. A brief report of this research appears in Bell & Sejnowski (1995).
2 Information maximisation

The basic problem tackled here is how to maximise the mutual information that the output Y of a neural network processor contains about its input X. This is defined as

I(Y; X) = H(Y) − H(Y|X)

where H(Y) is the entropy of the output, while H(Y|X) is whatever entropy the output has which did not come from the input. In the case that we have no noise (or rather, we do not know what is noise and what is signal in the input), the mapping between X and Y is deterministic and H(Y|X) has its lowest possible value: it diverges to minus infinity. This divergence is one of the consequences of the generalisation of information theory to continuous variables. What we call H(Y) is really the 'differential' entropy of Y with respect to some reference, such as the noise level or the accuracy of our discretisation of the variables in X and Y. To avoid such complexities, we consider here only the gradient of information-theoretic quantities with respect to some parameter w in our network. Such gradients are as well-behaved as discrete-variable entropies, because the reference terms involved in the definition of differential entropy disappear. The above equation can be differentiated as follows, with respect to a parameter w involved in the mapping from X to Y:

∂I(Y; X)/∂w = ∂H(Y)/∂w

because H(Y|X) does not depend on w (see the discussion in Haykin 1994b, and in Cover & Thomas 1991).

2.1 For one input and one output

When we pass a single input x through a transforming function g(x) to give an output variable y, both I(y; x) and H(y) are maximised when we align high-density parts of the probability density function (pdf) of x with highly sloping parts of the function g(x). This is the idea of matching a neuron's input-output function to the expected distribution of signals that we find in nature (Laughlin 1981). See Fig. 1a for an illustration.

When g(x) is monotonically increasing or decreasing (i.e., has a unique inverse), the pdf of the output, p_y(y), can be written as a function of the pdf of the input, p_x(x) (Papoulis 1984):

p_y(y) = p_x(x) / |∂y/∂x|

where the bars denote absolute value. The entropy of the output, H(y), is given by

H(y) = −E[ln p_y(y)]

where E[·] denotes expected value. Substituting the density relation into this gives

H(y) = E[ln |∂y/∂x|] − E[ln p_x(x)].

The second term on the right (the entropy of x) may be considered to be unaffected by alterations in a parameter w determining g(x). Therefore, in order to maximise the entropy of y, we need only concentrate on maximising the first term, which is the average log of how the input affects the output. This can be done by considering the 'training set' of x's to approximate the density p_x(x), and deriving an online, stochastic gradient ascent learning rule:

Δw ∝ ∂H/∂w = ∂/∂w (ln |∂y/∂x|) = (∂y/∂x)^(−1) ∂/∂w (∂y/∂x).

In the case of the logistic transfer function,

y = 1/(1 + e^(−u)),   u = wx + w0,

in which the input is multiplied by the weight w and added to the bias-weight w0, this yields the rules:

Δw ∝ 1/w + x(1 − 2y)
Δw0 ∝ 1 − 2y.
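As a sanity check, the single-unit rule can be simulated directly. The sketch below is our own illustration, not code from the paper: it trains a logistic unit on Laplacian-distributed input (a super-gaussian stand-in for the speech signals discussed later), and the learning rate, initial values and sample count are arbitrary demo choices.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.laplace(scale=1.0, size=20000)  # super-gaussian input samples

w, w0, lr = 0.5, 0.0, 0.01
for xi in x:
    y = 1.0 / (1.0 + np.exp(-(w * xi + w0)))    # logistic unit
    w += lr * (1.0 / w + xi * (1.0 - 2.0 * y))  # anti-decay term + anti-Hebbian term
    w0 += lr * (1.0 - 2.0 * y)                  # bias shift centres the output

y_all = 1.0 / (1.0 + np.exp(-(w * x + w0)))
print(w, w0, y_all.mean())  # mean output near 0.5 once the sigmoid matches the input
```

At convergence E[1 − 2y] = 0, so the bias rule centres the output distribution at 0.5, while the weight settles where the anti-decay term 1/w balances the anti-Hebbian term.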
2.2 For an N→N network

Consider a network with an input vector x, a weight matrix W, a bias vector w0, and a monotonically transformed output vector y = g(Wx + w0). Analogously to the single-unit case, the multivariate probability density function of y can be written (Papoulis 1984):

p_y(y) = p_x(x) / |J|

where |J| is the absolute value of the Jacobian of the transformation. The Jacobian is the determinant of the matrix of partial derivatives:

J = det[∂y_i/∂x_j].

The derivation proceeds as in the previous section, except that instead of maximising E[ln |∂y/∂x|], we now maximise E[ln |J|]. This latter quantity represents the log of the volume of space in y into which points in x are mapped. By maximising it, we attempt to spread our training set of x-points evenly in y.

For sigmoidal units, with g being the logistic function, the resulting learning rules are familiar in form (proof given in the Appendix):

ΔW ∝ [W^T]^(−1) + (1 − 2y)x^T
Δw0 ∝ 1 − 2y

except that now x, y, w0 and 1 are vectors (1 is a vector of ones), W is a matrix, and the anti-Hebbian term has become an outer product. The anti-decay term has generalised to an anti-redundancy term: the inverse of the transpose of the weight matrix. For an individual weight w_ij, this rule amounts to

Δw_ij ∝ cof(w_ij)/det W + x_j(1 − 2y_i)

where cof(w_ij), the cofactor of w_ij, is (−1)^(i+j) times the determinant of the matrix obtained by removing the i-th row and the j-th column from W.

This rule is the same as the one for the single-unit mapping, except that instead of w = 0 being an unstable point of the dynamics, now any degenerate weight matrix is unstable, since det W = 0 if W is degenerate. This fact enables different output units y_i to learn to represent different things in the input. When the weight vectors entering two output units become too similar, det W becomes small and the natural dynamic of learning causes these weight vectors to diverge from each other. This effect is mediated by the cofactor, cof(w_ij). When this cofactor becomes small, it indicates that there is a degeneracy in the weight matrix of the rest of the layer (i.e., in those weights not associated with input x_j or output y_i). In this case, any degeneracy in W has less to do with the specific weight w_ij that we are adjusting. Further discussion of the convergence conditions of this rule (in terms of higher-order moments) is deferred to section 6.2. The utility of this rule for performing blind separation is demonstrated in section 5.1.

2.3 For a causal filter

It is not necessary to restrict our architecture to weight matrices. Consider the top part of Fig. 1b, in which a time series x(t) is convolved with a causal filter of impulse response w(t), to give an output time series u(t), which is then passed through a non-linear function g to give y(t). We can write this system either as a convolution or as a matrix equation,

y = g(u) = g(Wx),

in which x and y are vectors of the whole time series and W is a matrix. When the filtering is causal, W will be lower triangular.
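The anti-redundancy term in the N→N rule can be verified numerically: the gradient of ln |det W| with respect to w_ij is cof(w_ij)/det W, which is exactly the (i, j) entry of [W^T]^(−1). The following finite-difference check is our own illustration; the matrix and step size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.normal(size=(3, 3)) + 3.0 * np.eye(3)  # a well-conditioned weight matrix

analytic = np.linalg.inv(W.T)  # claimed gradient of ln|det W|: [W^T]^(-1)

eps = 1e-6
numeric = np.zeros_like(W)
for i in range(3):
    for j in range(3):
        Wp, Wm = W.copy(), W.copy()
        Wp[i, j] += eps
        Wm[i, j] -= eps
        # central finite difference of ln|det W| in the (i, j) direction
        numeric[i, j] = (np.log(abs(np.linalg.det(Wp)))
                         - np.log(abs(np.linalg.det(Wm)))) / (2.0 * eps)

print(np.max(np.abs(numeric - analytic)))  # tiny: the two gradients agree
```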
2.4 For weights with time delays

Consider a weight w with a time delay d and a sigmoidal non-linearity, so that y = g(w x(t − d)). We can maximise the entropy of y with respect to the time delay, again by maximising the log slope of y as in the single-unit case:

Δd ∝ ∂H/∂d = E[∂/∂d ln |∂y/∂x|].

The crucial step in this derivation is to realise that

∂/∂d x(t − d) = −ẋ(t − d).

Calling this time derivative simply ẋ, and noting that ln |∂y/∂x| = ln |w| + ln |g′(u)| with u = w x(t − d), our general rule is therefore

Δd ∝ E[(g″(u)/g′(u)) (−w ẋ)].

If g is the tanh function, for example (for which g″/g′ = −2y), this yields the following rule for adapting the time delay:

Δd ∝ 2 w ẋ y.

This rule holds regardless of the architecture in which the network is embedded, and it is local, unlike the N→N rule. It bears a resemblance to the rule proposed by Platt & Faggin (1992) for adjustable time delays in the network architecture of Jutten & Herault (1991). The rule has an intuitive interpretation. Firstly, if w = 0, there is no reason to adjust the delay. Secondly, the rule stabilises when E[ẋ y] = 0. As an example, if we fed several sinusoidal inputs of the same frequency and different phase, each with its own adjustable time delay, then the time delays would adjust until the phases of the time-delayed inputs were all the same. Then, for each input, ẋ y would be proportional to cos(t) tanh(sin(t)), the expected value of which is zero.

[Figure: the generalised logistic sigmoid y = g(u) (top row) and its slope dy/du (bottom row), for (a) p = 1, r = 1; (b) p = 5, r = 5; and (c) p = 0.2, r = 0.2. The slope in (b) provides a good match for the pdfs of natural speech signals.]

3 Background to blind separation and blind deconvolution

Blind separation and blind deconvolution are related problems in signal processing. In blind separation, as introduced by Herault & Jutten (1986), a set of sources s_1(t), ..., s_N(t) (different people speaking, music, etc.) is mixed together linearly by a matrix A. We do not know anything about the sources or the mixing process. All we receive are the N superpositions of them, x_1(t), ..., x_N(t). The task is to recover the original sources by finding a square matrix W which is a permutation and rescaling of the inverse of the unknown matrix A. The problem has also been called the 'cocktail-party' problem.

In blind deconvolution, described in Haykin (1991, 1994a), a single unknown signal s(t) is convolved with an unknown tapped-delay-line filter a_1, ..., a_K, giving a corrupted signal x(t) = a(t) * s(t), where a(t) is the impulse response of the filter. The task is to recover s(t) by convolving x(t) with a learnt filter which reverses the effect of the corrupting filter a(t), though for now we ignore the problem of signal propagation delays. Blind separation thus becomes the problem of removing statistical dependencies between outputs introduced by the mixing matrix A, and blind deconvolution becomes the problem of removing
any statistical dependencies across time introduced by the corrupting filter. The former process, the learning of W, is called the problem of Independent Component Analysis, or ICA (Comon 1994). The latter process, the learning of the deconvolving filter, is sometimes called the 'whitening' of x(t). Henceforth, we use the term redundancy reduction when we mean either ICA or the whitening of a time series.

In either case, it is clear in an information-theoretic context that second-order statistics are inadequate for reducing redundancy, because the mutual information between two variables involves statistics of all orders, except in the special case that the variables are jointly gaussian.

In the various approaches in the literature, the higher-order statistics required for redundancy reduction have been accessed in two main ways. The first way is the explicit estimation of cumulants and polyspectra. See Comon (1994) and Hatzinakos & Nikias (1994) for the application of this approach to separation and deconvolution respectively. The drawbacks of such direct techniques are that they can sometimes be computationally intensive, and may be inaccurate when cumulants higher than fourth order are ignored, as they usually are. It is currently not clear why direct approaches can be surprisingly successful despite errors in the estimation of the cumulants, and in the usage of these cumulants to estimate mutual information.

The second main way of accessing higher-order statistics is through the use of static non-linear functions. The Taylor series expansions of these non-linearities yield higher-order terms. The hope, in general, is that learning rules containing such terms will be sensitive to the right higher-order statistics necessary to perform ICA or whitening. Such reasoning has been used to justify both the Herault-Jutten (or H-J) approach to blind separation (Comon et al. 1991) and the so-called 'Bussgang' approaches to blind deconvolution (Bellini 1994). The drawback here is that there is no guarantee that the higher-order statistics yielded by the non-linearities are weighted in a way relating to the calculation of statistical dependency. For the H-J algorithm, the standard practice has been to try different non-linearities on different problems to see if they work. Clearly, it would be of benefit to have some method of rigorously linking our choice of a static non-linearity to a learning rule performing gradient ascent in some quantity relating to statistical dependency.

[Figure: two independent, uniformly distributed inputs x_1, x_2 passed through sigmoids f(x_1), f(x_2); the joint output densities are shown for the 'independent-component' solution and for the 'diagonal' solution y_1 = y_2.]
Figure: An example of when joint entropy maximisation fails to yield statistically independent components. (a) Two independent input variables, having uniform (flat) pdfs, are input into an entropy maximisation network with sigmoidal outputs. Because the input pdfs are not well matched to the non-linearity, the 'diagonal' solution (c) has higher joint entropy than the 'independent-component' solution (b), despite its having non-zero mutual information between the outputs. The values given are for illustration purposes.

In many practical situations, however, such interference will have minimal effect. We conjecture that only when the pdfs of the inputs are sub-gaussian (meaning their kurtosis, or fourth-order standardised cumulant, is less than zero) may unwanted higher-entropy solutions for logistic sigmoid networks be found by combining inputs in the way shown in (c) (Kenji Doya, personal communication). Many real-world analog signals, including the speech signals we used, are super-gaussian: they have longer tails and are more sharply peaked than gaussians. For such signals, in our experience, maximising the joint entropy in simple logistic sigmoidal networks always minimises the mutual information between the outputs (see the results in section 5).

We can tailor conditions so that the mutual information between outputs is minimised, by constructing our non-linear function g(u) so that it matches, in the sense of the matching argument of section 2.1, the known pdfs of the independent variables. When this is the case, H(y) will be maximised, meaning that p_y(y) will be the flat unit distribution.

5 Methods and results

The experiments presented here were obtained using several-second segments of speech recorded from various speakers (only one speaker per recording). All signals were sampled at 8 kHz from the output of the auxiliary microphone of a Sparc workstation. No special post-processing was performed on the waveforms, other than that of normalising their amplitudes so they were appropriate for use with our networks. The method of training was stochastic gradient ascent, but because of the costly matrix inversion in the N→N rule, weights were usually adjusted on the basis of the summed ΔW's of small 'batches' of the training set. Batch training was made efficient using vectorised code written in MATLAB. To ensure that the input ensemble was stationary in time, the time index of the signals was permuted. This means that at each iteration of the training, the network would receive input from a random time point. Various learning rates were used.
It was helpful to reduce the learning rate during learning for convergence to good solutions.

5.1 Blind separation results

The feedforward architecture and the N→N algorithm of section 2.2 were sufficient to perform blind separation. A random mixing matrix A was generated, with values usually drawn from a uniform distribution. This was used to make the mixed time series x from the original sources s. The matrices s and x were then both N×M matrices (N sources, M time points), with x constructed from s by (1) permuting the time index of s, and (2) creating the mixtures by multiplying by the mixing matrix A. The unmixing matrix W and the bias vector w0 were then trained.

In an example run with five sources, the mixtures formed an incomprehensible babble. The unmixed solution was reached after a number of time points equivalent to many passes through the complete time series had been presented, though much of the improvement occurred on the first few passes through the data. Any residual interference in the unmixed outputs is inaudible. (The learning rate is defined as the proportionality constant in the learning rules of sections 2.1 and 2.2.) This took on the order of minutes on a Sparc. Two hundred data points were presented at a time in a 'batch'; the weights were then changed, based on the sum of the accumulated ΔW's.

A mixture of ten sources — among them raucous laughter, a gong, and the Hallelujah chorus — was also successfully separated, though the fine tuning of the solution took many hours and required some annealing of the learning rate (lowering it with time). For two sources, convergence is normally achieved in less than one pass through the data, and, on a Sparc, online learning can occur at twice the speed at which the sounds themselves are played. Real-time separation for larger numbers of sources may require further work to speed convergence, or special-purpose hardware.

In all our attempts at blind separation, the algorithm has failed under only two conditions: when more than one of the sources was gaussian white noise, and when the mixing matrix A was almost singular. Both are understandable. Firstly, no procedure can separate out independent gaussian sources, since the sum of two gaussian variables has itself a gaussian distribution. Secondly, if A is almost singular, then any unmixing W must also be almost singular, making the learning in the N→N rule quite unstable in the vicinity of a solution.
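A miniature version of this experiment can be reproduced from the rules above. The sketch below is our own illustration, not the authors' MATLAB code: Laplacian noise stands in for speech, and the mixing range, batch size, learning rate and epoch count are arbitrary demo choices. The product W·A_eff should approach a permutation and rescaling of the identity.

```python
import numpy as np

rng = np.random.default_rng(3)
N, M = 2, 20000
s = rng.laplace(size=(N, M))             # super-gaussian stand-ins for speech
A = rng.uniform(-1.0, 1.0, size=(N, N))  # random mixing matrix
while abs(np.linalg.det(A)) < 0.3:       # avoid the near-singular failure mode
    A = rng.uniform(-1.0, 1.0, size=(N, N))

x = A @ s
d = x.std(axis=1, keepdims=True)
x /= d                                   # normalise mixture amplitudes
A_eff = A / d                            # the mixing the network actually sees

W, b = np.eye(N), np.zeros((N, 1))
lr, B = 0.01, 200                        # learning rate and batch length
for epoch in range(50):
    perm = rng.permutation(M)            # permute the time index each epoch
    for k in range(0, M, B):
        xb = x[:, perm[k:k + B]]
        y = 1.0 / (1.0 + np.exp(-(W @ xb + b)))
        # batch-averaged infomax updates: [W^T]^(-1) + (1-2y)x^T
        W += lr * (np.linalg.inv(W.T) + (1.0 - 2.0 * y) @ xb.T / B)
        b += lr * np.mean(1.0 - 2.0 * y, axis=1, keepdims=True)

P = W @ A_eff                            # ideally a permutation and rescaling
dominance = np.abs(P).max(axis=1) / np.abs(P).sum(axis=1)
print(dominance)                         # close to 1 when separation succeeded
```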
In contrast with these results, our experience with tests on the H-J network of Jutten & Herault (1991) has been that it occasionally fails to converge for two sources, and only rarely converges for three, on the same speech and music signals we used for separating ten sources. Cohen & Andreou (1992) report separation of up to six sinusoidal signals of different frequencies using analog VLSI H-J networks. In later work, they report results with mixed sine waves and noise in larger networks, but no separation results for more than two speakers.

How does convergence time scale with the number of sources, N? The difficulty in answering this question is that different learning rates are required for different N and for different stages of convergence. We expect to address this issue in future work, and to employ useful heuristic or explicit second-order techniques (Battiti 1992) to speed convergence. For now, we present rough estimates of the number of epochs (each a full presentation of the data vectors) required to reach a criterion average signal-to-noise ratio on the output channels — a level at which almost all of each output channel's amplitude is devoted to a single source. These results were collected for mixing matrices of unit determinant, so that convergence would not be hampered by having to find an unmixing matrix with especially large entries.

[Table: rough numbers of epochs required to reach the convergence criterion as the number of sources N increases.]
[Figure: blind deconvolution results for three tasks — WHITENING (tap spacing 1 sample = 0.125 ms), BARREL EFFECT (10 samples = 1.25 ms) and MANY ECHOES (100 samples = 12.5 ms). Panels (a, e, i): the ideal convolving filters 'a' used to corrupt the speech signals; (b, f, j): the ideal deconvolving filters 'w'; (c, g, k): the deconvolving filters 'w' learnt by the algorithm; (d, h, l): the convolution of the convolving and deconvolving filters, w * a. See text for further explanation.]

When the taps are spaced out further, as in (i), there is less opportunity for simple whitening. In the second example, an echo a few milliseconds long is added to the signal; this creates a mild, audible 'barrel effect'. Because the filter in (e) is finite in length, its inverse, (f), is infinite in length, shown here truncated. The inverting filter learnt in (g) resembles it, though the resemblance tails off towards the left, since we are really learning an optimal filter of finite length, not a truncated infinite filter. The resulting deconvolution, (h), is quite good. The cleanest results, though, come when the ideal deconvolving filter is itself of finite length. An example of the barrel effect is the acoustic echo heard when someone talks into a 'speaker-phone'.

We have also performed experiments with speech signals in which signals were simultaneously separated and deconvolved using these rules. We used mixtures of two signals with convolution filters like those in (e) and (i), and convergence to separated, deconvolved speech was almost perfect.

6 Discussion

We will consider these techniques firstly in the context of previous information-theoretic approaches within neural networks, and then in the context of related approaches to 'blind' signal processing problems.

6.1 Comparison with previous work on information maximisation

Many authors have formulated optimality criteria similar to ours, for both neural networks and sensory systems (Barlow 1989; Atick 1992; Bialek, Ruderman & Zee 1991). However, our work is most similar to that of Linsker, who proposed an 'infomax' principle for linear mappings with various forms of noise. Linsker (1992) derived a learning algorithm for maximising the mutual information between two layers of a network. This infomax criterion is the same as ours. However, the problem as formulated here differs in the following respects: (1) there is no noise — or rather, there is no noise model in this system; (2) there is no assumption that inputs or outputs have gaussian statistics; and (3) the transfer function is in general non-linear. These differences lead to quite a different learning rule. Linsker's rule uses (for input signal X and output Y) a Hebbian term to maximise H(Y) when the network receives both signal and noise, an anti-Hebbian term to minimise H(Y|X) when the system receives only noise, and an anti-Hebbian lateral interaction to decorrelate the outputs. When the network is deterministic, however, the H(Y|X) term does not contribute. A deterministic linear network can increase its information throughput without bound, as the [W^T]^(−1) term in our rule suggests.
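The last point is easy to see numerically: with no squashing non-linearity, the entropy gradient reduces to the [W^T]^(−1) term alone, and following it inflates ln |det W| forever. A small sketch (our own illustration; the step size and iteration count are arbitrary):

```python
import numpy as np

W = np.eye(2)
lr = 0.1
dets = []
for _ in range(100):
    W += lr * np.linalg.inv(W.T)  # entropy gradient for a purely linear network
    dets.append(abs(np.linalg.det(W)))

print(dets[0], dets[-1])  # |det W|, and hence H(y), grows without bound
```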
6.2 Comparison with previous work on blind separation

As indicated in section 3, approaches to blind separation and blind deconvolution have divided into those using non-linear functions (Jutten & Herault 1991; Bellini 1994) and those using explicit calculations of cumulants and polyspectra (Comon 1994; Hatzinakos & Nikias 1994). We have shown that an information maximisation approach can provide a theoretical framework for approaches of the former type.

In the case of blind separation, the architecture of our N→N network, although it is a feedforward network, maps directly onto that of the recurrent Herault-Jutten network. The relationship between our weight matrix W and the H-J recurrent weight matrix W_HJ can be written as

W = (I + W_HJ)^(−1)

where I is the identity matrix. From this we may write

ΔW_HJ = −(I + W_HJ) ΔW (I + W_HJ)

so that our learning rule for W forms part of a rule for the recurrent H-J network. Unfortunately, this rule is complex and not obviously related to the non-linear anti-Hebbian rule proposed for the H-J net,

ΔW_HJ,ij ∝ g(u_i) h(u_j),  i ≠ j,

where g and h are odd non-linear functions. It remains to conduct a detailed performance comparison between the H-J rule and the algorithm presented here. We performed many simulations in which the H-J net failed to converge, but because there is substantial freedom in the choice of g and h, we cannot be sure that our choices were good ones.

We now compare the convergence criteria of the two algorithms in order to show how they are related. The explanation (Jutten & Herault 1991) for the success of the H-J network is that the Taylor series expansion of g(u_i)h(u_j) yields odd cross-moments, such that the weights stop changing when the weighted sums of these cross-moments — with coefficients b_ijpq coming from the Taylor series expansions of g and h — are zero for all output unit pairs i ≠ j. This, they argue, provides an 'approximation of an independence test'. Hopfield (1991) applied a variant of the H-J architecture to odour separation in a model of the olfactory bulb.
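The architectural correspondence can be checked directly: with W_HJ = W^(−1) − I, the equilibrium of the recurrent network, u = x − W_HJ u, reproduces the feedforward net's u = Wx. A numerical sketch (our own illustration; the matrix values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
W = rng.normal(size=(3, 3)) + 2.0 * np.eye(3)  # a well-conditioned feedforward matrix
x = rng.normal(size=3)

u_ff = W @ x                                  # feedforward net activation

W_hj = np.linalg.inv(W) - np.eye(3)           # equivalent recurrent H-J matrix
u_rec = np.linalg.solve(np.eye(3) + W_hj, x)  # equilibrium of u = x - W_hj @ u

print(np.max(np.abs(u_ff - u_rec)))  # ~0: the two architectures agree
```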
6.3 Comparison with previous work on blind deconvolution

In the case of blind deconvolution, our approach most resembles the 'Bussgang' family of techniques (Bellini 1994; Haykin 1991). These algorithms assume some knowledge about the input distributions in order to sculpt a non-linearity which may be used in the creation of a memoryless conditional estimator for the input signal. In our notation, the non-linearly transformed output y is exactly this conditional estimator, and the goal of the system is to change weights until u, the actual output, is the same as y, our estimate of the uncorrupted signal. An error is thus defined, error = y − u, and a stochastic weight update rule follows directly from gradient descent in mean-squared error. This gives the blind deconvolution rule for a tapped delay line at time t:

Δw_j(t) ∝ x(t − j)(y(t) − u(t))

(compare with our causal filter rule of section 2.3). When y = tanh(u), this rule is very similar to ours. The only difference is that it contains the term tanh(u) − u where ours has the term −2 tanh(u); but, as can easily be verified, these terms are of the same sign at all times, so the algorithms should behave similarly.

The theoretical justifications for the Bussgang approaches are, however, a little obscure, and, as with the Herault-Jutten rules, part of their appeal derives from the fact that they have been shown to work in many circumstances. The primary difficulty lies in the consideration of y as a conditional estimator. Why, a priori, should we consider a non-linearly transformed output to be a conditional estimator for the unconvolved input? The answer comes from Bayesian considerations. The output u is considered to be a noisy version of the original signal. Models of the pdfs of the original signal and this noise are then constructed, and Bayesian reasoning yields a non-linear conditional estimator of the signal from u, which can be quite complex (see Haykin 1991). It is not clear, however, that the 'noise' introduced by the convolving filter a is well described by such a model.

A further limitation is that our N→N learning rule is decidedly non-local. Each 'neuron' must know the cofactors either of all the weights entering it, or of all those leaving it. Some architecture may be found which enables information maximisation to take place using only local information. The existence of local learning rules such as the H-J network suggests that it may be possible to develop local learning rules approximating the non-local ones derived here. For now, however, the N→N learning rule remains non-local.

Despite these concerns, we believe that the information maximisation approach presented here could serve as a unifying framework that brings together several lines of research, and as a guiding principle for further advances. The principles may also be applied to other sensory modalities such as vision, where Field (1994) has recently argued that phase-insensitive information maximisation (using only second-order statistics) is unable to predict local (non-Fourier) receptive fields.
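The sign claim in the Bussgang comparison above is easy to verify. The sketch below is our own check, taking the Bussgang error term as tanh(u) − u and the infomax anti-Hebbian term for a tanh unit as −2 tanh(u); both choices follow the comparison in the text, and the grid of u values is arbitrary.

```python
import numpy as np

u = np.concatenate([np.linspace(-5.0, -0.01, 500),
                    np.linspace(0.01, 5.0, 500)])  # skip u = 0, where both vanish

bussgang = np.tanh(u) - u    # Bussgang error term: estimator minus actual output
infomax = -2.0 * np.tanh(u)  # infomax anti-Hebbian term for a tanh unit

same_sign = bool(np.all(np.sign(bussgang) == np.sign(infomax)))
print(same_sign)  # True: both terms always push the weight the same way
```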
Appendix: proof of the N→N learning rule

Consider a network with an input vector x, a weight matrix W, a bias vector w0, and a non-linearly transformed output vector y = g(u), u = Wx + w0. W is a square matrix and g is an invertible function, so the multivariate probability density function of y can be written (Papoulis 1984):

p_y(y) = p_x(x) / |J|

where |J| is the absolute value of the Jacobian of the transformation. This simplifies to the product of the determinant of the weight matrix and the derivatives of the outputs with respect to their net inputs:

J = (det W) ∏_i (∂y_i/∂u_i).

For example, in the case where the non-linearity is the logistic sigmoid, y_i = 1/(1 + e^(−u_i)), we have ∂y_i/∂u_i = y_i(1 − y_i). We can perform gradient ascent in the information that the outputs transmit about the inputs by noting that the information gradient is the same as the entropy gradient in the deterministic (zero-noise) case:

ΔW ∝ ∂H(y)/∂W = ∂/∂W (ln |det W|) + ∂/∂W (Σ_i E[ln y_i(1 − y_i)]).

The first term gives [W^T]^(−1); for the logistic sigmoid, the second gives the anti-Hebbian outer product (1 − 2y)x^T, yielding the rule quoted in section 2.2.

Flexible non-linearities. The generalised cumulative gaussian function has a variable exponent, r. This can be varied to produce squashing functions suitable for symmetrical input distributions with very high or very low kurtosis. When r is very large, the function is suitable for flat, low-kurtosis input distributions; when r is close to zero, it fits high-kurtosis input distributions, which are peaked with long tails. Analogously, it is possible to define a generalised 'tanh' sigmoid, of which the hyperbolic tangent is a special case. The values of this function can in general only be attained by numerical integration (in both directions) of the differential equation for its slope, from a boundary condition of y(0) = 0. Once this is done, however, and the values are stored in a look-up table, the slope and anti-Hebbian terms are easily evaluated at each presentation. Again, this should be useful for data which may have flat or peaky pdfs. The learning rule for a gaussian radial basis function node shows the utility of non-monotonic functions for information maximisation, though a term in the denominator would make such learning unstable when the net input to the unit was zero.

Acknowledgements. This research was supported by a grant from the Office of Naval Research. We are much indebted to Nicol Schraudolph, who not only supplied an original idea and shared his unpublished calculations, but also provided detailed criticism at every stage of the work. Many helpful observations also came from Paul Viola, Barak Pearlmutter, Kenji Doya, Misha Tsodyks, Alexandre Pouget, Peter Dayan, Olivier Coenen and Iris Ginzburg.

References

Atick, J.J. (1992). Could information theory provide an ecological theory of sensory processing? Network: Computation in Neural Systems, 3.
Atick, J.J. & Redlich, A.N. (1993). Convergent algorithm for sensory receptive field development. Neural Computation, 5.
Baram, Y. & Roth, Z. Multi-dimensional density shaping by sigmoidal networks with application to classification, estimation and forecasting. CIS report, Centre for Intelligent Systems, Dept of Computer Science, Technion, Israel Institute of Technology, Haifa (submitted for publication).
Barlow, H.B. (1961). Possible principles underlying the transformation of sensory messages. In Rosenblith, W.A. (ed.), Sensory Communication. MIT Press.
Barlow, H.B. (1989). Unsupervised learning. Neural Computation, 1.
Barlow, H.B. & Földiák, P. (1989). Adaptation and decorrelation in the cortex. In Durbin, R. et al. (eds.), The Computing Neuron. Addison-Wesley.
Battiti, R. (1992). First- and second-order methods for learning: between steepest descent and Newton's method. Neural Computation, 4.
Becker, S. & Hinton, G.E. (1992). A self-organising neural network that discovers surfaces in random-dot stereograms. Nature, 355.
Haykin, S. (1991). Adaptive Filter Theory (2nd edition). Prentice-Hall.
Haykin, S. (1992). Blind equalisation formulated as a self-organized learning process. Proceedings of the Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA.
Haykin, S. (ed.) (1994a). Blind Deconvolution. Prentice-Hall, New Jersey.
Haykin, S. (1994b). Neural Networks: A Comprehensive Foundation. Macmillan, New York.
Herault, J. & Jutten, C. (1986). Space or time adaptive signal processing by neural network models. In Denker, J.S. (ed.), Neural Networks for Computing: AIP Conference Proceedings 151. American Institute for Physics, New York.
Hopfield, J.J. (1991). Olfactory computation and object perception. Proceedings of the National Academy of Sciences USA, 88.
Jutten, C. & Herault, J. (1991). Blind separation of sources, part I: an adaptive algorithm based on neuromimetic architecture. Signal Processing, 24.
Karhunen, J. & Joutsensalo, J. (1994). Representation and separation of signals using non-linear PCA type learning. Neural Networks, 7.
Laughlin, S. (1981). A simple coding procedure enhances a neuron's information capacity. Zeitschrift für Naturforschung, 36.
Li, S. & Sejnowski, T.J. (1995). Adaptive separation of mixed broadband sound sources with delays by a beamforming Herault-Jutten network. IEEE Journal of Oceanic Engineering, 20.
Linsker, R. (1989). An application of the principle of maximum information preservation to linear systems. In Touretzky, D.S. (ed.), Advances in Neural Information Processing Systems 1. Morgan Kaufmann.
Linsker, R. (1992). Local synaptic learning rules suffice to maximise mutual information in a linear network. Neural Computation, 4.
Vittoz, E.A. & Arreguit, X. (1989). CMOS integration of Herault-Jutten cells for separation of sources. In Mead, C. & Ismail, M. (eds.), Analog VLSI Implementation of Neural Systems. Kluwer, Boston.
Yellin, D. & Weinstein, E. (1994). Criteria for multichannel signal separation. IEEE Transactions on Signal Processing, 42.
