Derivation of Backpropagation

1 Introduction

Figure 1: Neural network processing

Conceptually, a network forward propagates activation to produce an output, and it backward propagates error to determine weight changes (as shown in Figure 1). The weights on the connections between neurons mediate the passed values in both directions.

The Backpropagation algorithm is used to learn the weights of a multilayer neural network with a fixed architecture. It performs gradient descent to try to minimize the sum-squared error between the network's output values and the given target values.

Figure 2 depicts the network components which affect a particular weight change. Notice that all the necessary components are locally related to the weight being updated. This is one feature of backpropagation that seems biologically plausible. However, brain connections appear to be unidirectional and not bidirectional as would be required to implement backpropagation.

2 Notation

For the purpose of this derivation, we will use the following notation:

- The subscript k denotes the output layer.
- The subscript j denotes the hidden layer.
- The subscript i denotes the input layer.
- w_{kj} denotes a weight from the hidden to the output layer.
- w_{ji} denotes a weight from the input to the hidden layer.
- a denotes an activation value.
- t denotes a target value.
- net denotes the net input.

Figure 2: The change to a hidden-to-output weight depends on error (depicted as a lined pattern) at the output node and activation (depicted as a solid pattern) at the hidden node, while the change to an input-to-hidden weight depends on error at the hidden node (which in turn depends on error at all the output nodes) and activation at the input node.

3 Review of Calculus Rules

\frac{d(e^u)}{dx} = e^u \frac{du}{dx}

\frac{d(g+h)}{dx} = \frac{dg}{dx} + \frac{dh}{dx}

\frac{d(g^n)}{dx} = n g^{n-1} \frac{dg}{dx}

4 Gradient Descent on Error

We can motivate the backpropagation learning algorithm as gradient descent on sum-squared error (we square the error because we are interested in its magnitude, not its sign). The total error in a network is given by the following equation (the 1/2 will simplify things later).

E = \frac{1}{2} \sum_k (t_k - a_k)^2

We want to adjust the network's weights to reduce this overall error:

\Delta W \propto -\frac{\partial E}{\partial W}

We will begin at the output layer with a particular weight:

\Delta w_{kj} \propto -\frac{\partial E}{\partial w_{kj}}

However, error is not directly a function of a weight. We expand this as follows:

\Delta w_{kj} = -\varepsilon \, \frac{\partial E}{\partial a_k} \frac{\partial a_k}{\partial net_k} \frac{\partial net_k}{\partial w_{kj}}

Let's consider each of these partial derivatives in turn. Note that only one term of the E summation will have a non-zero derivative: the one associated with the particular weight we are considering.

4.1 Derivative of the error with respect to the activation

\frac{\partial E}{\partial a_k} = \frac{\partial \left( \frac{1}{2} (t_k - a_k)^2 \right)}{\partial a_k} = -(t_k - a_k)

Now we see why the 1/2 in the E term was useful.

4.2 Derivative of the activation with respect to the net input

\frac{\partial a_k}{\partial net_k} = \frac{\partial \left( 1 + e^{-net_k} \right)^{-1}}{\partial net_k} = \frac{e^{-net_k}}{\left( 1 + e^{-net_k} \right)^2}

We'd like to be able to rewrite this result in terms of the activation function. Notice that:

1 - \frac{1}{1 + e^{-net_k}} = \frac{e^{-net_k}}{1 + e^{-net_k}}

Using this fact, we can rewrite the result of the partial derivative as:

a_k (1 - a_k)

4.3 Derivative of the net input with respect to a weight

Note that only one term of the net summation will have a non-zero derivative: again, the one associated with the particular weight we are considering.

\frac{\partial net_k}{\partial w_{kj}} = \frac{\partial (w_{kj} a_j)}{\partial w_{kj}} = a_j

4.4 Weight change rule for a hidden-to-output weight

Now, substituting these results back into our original equation, we have:

\Delta w_{kj} = \varepsilon \overbrace{(t_k - a_k) \, a_k (1 - a_k)}^{\delta_k} a_j

Notice that this looks very similar to the Perceptron Training Rule. The only difference is the inclusion of the derivative of the activation function. This equation is typically simplified as shown below, where the \delta term represents the product of the error with the derivative of the activation function.

\Delta w_{kj} = \varepsilon \, \delta_k \, a_j

4.5 Weight change rule for an input-to-hidden weight

Now we have to determine the appropriate weight change for an input-to-hidden weight. This is more complicated because it depends on the error at all of the nodes this weighted connection can lead to.

\Delta w_{ji} \propto -\left[ \sum_k \frac{\partial E}{\partial a_k} \frac{\partial a_k}{\partial net_k} \frac{\partial net_k}{\partial a_j} \right] \frac{\partial a_j}{\partial net_j} \frac{\partial net_j}{\partial w_{ji}}

\Delta w_{ji} = \varepsilon \left[ \sum_k \overbrace{(t_k - a_k) \, a_k (1 - a_k)}^{\delta_k} w_{kj} \right] a_j (1 - a_j) \, a_i

\Delta w_{ji} = \varepsilon \overbrace{\left[ \sum_k \delta_k w_{kj} \right] a_j (1 - a_j)}^{\delta_j} a_i

\Delta w_{ji} = \varepsilon \, \delta_j \, a_i
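As a quick numerical check of the algebra in Section 4.2, the following sketch (Python with NumPy; the function and variable names are ours, not the handout's) compares the closed form a_k(1 - a_k) against a finite-difference estimate of the derivative of the activation with respect to the net input.

```python
import numpy as np

def sigmoid(net):
    """Logistic activation a = (1 + e^{-net})^{-1}, as in Section 4.2."""
    return 1.0 / (1.0 + np.exp(-net))

net_k = np.linspace(-4.0, 4.0, 9)   # a few sample net inputs
a_k = sigmoid(net_k)

closed_form = a_k * (1 - a_k)       # the rewritten derivative a_k(1 - a_k)
h = 1e-6                            # finite-difference step
numeric = (sigmoid(net_k + h) - sigmoid(net_k - h)) / (2 * h)

print(np.allclose(closed_form, numeric, atol=1e-8))  # True
```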
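The derivation also translates directly into code. Below is a minimal sketch (again Python with NumPy; the 2-2-1 architecture, learning rate, and names like w_kj and delta_k are illustrative assumptions, not taken from the handout) that applies the two weight-change rules from Sections 4.4 and 4.5 to a tiny network, and verifies that they match -\varepsilon \, \partial E / \partial w computed by finite differences.

```python
import numpy as np

def sigmoid(net):
    """Logistic activation: a = 1 / (1 + e^{-net})."""
    return 1.0 / (1.0 + np.exp(-net))

# Tiny fixed-architecture network: 2 inputs -> 2 hidden -> 1 output.
rng = np.random.default_rng(0)
w_ji = rng.normal(size=(2, 2))   # input-to-hidden weights, w_ji[j, i]
w_kj = rng.normal(size=(1, 2))   # hidden-to-output weights, w_kj[k, j]

a_i = np.array([0.35, 0.9])      # input activations (example values)
t_k = np.array([0.5])            # target values (example values)
eps = 0.1                        # learning rate (epsilon in the derivation)

def forward(w_ji, w_kj, a_i):
    """Forward propagation: net = sum of weighted inputs, a = sigmoid(net)."""
    a_j = sigmoid(w_ji @ a_i)
    a_k = sigmoid(w_kj @ a_j)
    return a_j, a_k

def error(w_ji, w_kj):
    """Sum-squared error E = 1/2 sum_k (t_k - a_k)^2."""
    _, a_k = forward(w_ji, w_kj, a_i)
    return 0.5 * np.sum((t_k - a_k) ** 2)

# Backward propagation of error, exactly as derived:
a_j, a_k = forward(w_ji, w_kj, a_i)
delta_k = (t_k - a_k) * a_k * (1 - a_k)          # Section 4.4
delta_j = (w_kj.T @ delta_k) * a_j * (1 - a_j)   # Section 4.5

dw_kj = eps * np.outer(delta_k, a_j)             # Delta w_kj = eps * delta_k * a_j
dw_ji = eps * np.outer(delta_j, a_i)             # Delta w_ji = eps * delta_j * a_i

# Check: the delta rule should equal -eps * dE/dw_kj from finite differences.
h = 1e-6
num = np.zeros_like(w_kj)
for idx in np.ndindex(*w_kj.shape):
    w_plus = w_kj.copy();  w_plus[idx] += h
    w_minus = w_kj.copy(); w_minus[idx] -= h
    num[idx] = (error(w_ji, w_plus) - error(w_ji, w_minus)) / (2 * h)
print(np.allclose(dw_kj, -eps * num, atol=1e-8))  # True

w_kj += dw_kj                                    # one gradient-descent step
w_ji += dw_ji
```

Note that delta_j is computed from the deltas at the output layer, mirroring the bracketed sum over k in Section 4.5: the error at a hidden node is the weighted sum of the errors at all the output nodes it feeds.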