Radial Basis Function Networks: Introduction
Introduction to Neural Networks: Lecture 12
© John A. Bullinaria, 2004

1. Introduction to Radial Basis Functions
2. Exact Interpolation
3. Common Radial Basis Functions
4. Radial Basis Function (RBF) Networks
5. Problems with Exact Interpolation Networks
6. Improving RBF Networks
7. The Improved RBF Network

L12-2: Introduction to Radial Basis Functions

The idea of Radial Basis Function (RBF) Networks derives from the theory of function approximation. We have already seen how Multi-Layer Perceptron (MLP) networks with a hidden layer of sigmoidal units can learn to approximate functions. RBF Networks take a slightly different approach. Their main features are:

1. They are two-layer feed-forward networks.
2. The hidden nodes implement a set of radial basis functions (e.g. Gaussian functions).
3. The output nodes implement linear summation functions, as in an MLP.
4. The network training is divided into two stages: first the weights from the input to hidden layer are determined, and then the weights from the hidden to output layer.
5. The training/learning is very fast.
6. The networks are very good at interpolation.

We'll spend the next three lectures studying the details.

L12-3: Exact Interpolation

The exact interpolation of a set of N data points in a multi-dimensional space requires every one of the D-dimensional input vectors x^p = \{x_i^p : i = 1, \ldots, D\} to be mapped onto the corresponding target output t^p. The goal is to find a function f(x) such that

    f(x^p) = t^p, \quad p = 1, \ldots, N

The radial basis function approach introduces a set of N basis functions, one for each data point, which take the form \phi(\|x - x^p\|), where \phi(\cdot) is some non-linear function whose form will be discussed shortly. Thus the pth such function depends on the distance \|x - x^p\|, usually taken to be Euclidean, between x and x^p. The output of the mapping is then taken to be a linear combination of the basis functions, i.e.

    f(x) = \sum_{p=1}^{N} w_p \, \phi(\|x - x^p\|)

The idea is to find the "weights" w_p such that the function goes through the data points.

L12-4: Determining the Weights

It is easy to determine equations for the weights by combining the above equations:

    f(x^q) = \sum_{p=1}^{N} w_p \, \phi(\|x^q - x^p\|) = t^q

We can write this in matrix form by defining the vectors t = \{t^p\} and w = \{w_p\}, and the matrix \Phi = \{\Phi_{qp}\} = \{\phi(\|x^q - x^p\|)\}. This simplifies the equation to \Phi w = t. Then, provided the inverse of \Phi exists, we can use any standard matrix inversion technique to give

    w = \Phi^{-1} t

It can be shown that, for a large class of basis functions \phi(\cdot), the matrix \Phi is indeed non-singular (and hence invertible) provided the data points are distinct. Once we have the weights, we have a function f(x) that represents a continuous differentiable surface that passes exactly through each data point.

L12-5: Commonly Used Radial Basis Functions

A range of theoretical and empirical studies have indicated that many properties of the interpolating function are relatively insensitive to the precise form of the basis functions \phi(r). Some of the most commonly used basis functions are:

1. Gaussian Functions: \phi(r) = \exp(-r^2 / 2\sigma^2), width parameter \sigma > 0
2. Multi-Quadric Functions: \phi(r) = (r^2 + \sigma^2)^{1/2}, parameter \sigma > 0
3. Generalized Multi-Quadric Functions: \phi(r) = (r^2 + \sigma^2)^{\beta}, parameters \sigma > 0, 0 < \beta < 1

L12-6: Commonly Used Radial Basis Functions (continued)

4. Inverse Multi-Quadric Functions: \phi(r) = (r^2 + \sigma^2)^{-1/2}, parameter \sigma > 0
5. Generalized Inverse Multi-Quadric Functions: \phi(r) = (r^2 + \sigma^2)^{-\alpha}, parameters \sigma > 0, \alpha > 0
6. Thin Plate Spline Function: \phi(r) = r^2 \ln(r)
7. Cubic Function: \phi(r) = r^3
8. Linear Function: \phi(r) = r
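To make the exact interpolation procedure of L12-3 and L12-4 concrete, here is a minimal sketch in Python/NumPy (not part of the original lecture notes; the Gaussian width sigma and the toy one-dimensional data are illustrative assumptions). It builds the matrix \Phi, solves \Phi w = t for the weights, and evaluates the resulting interpolant:

```python
# A minimal sketch of exact RBF interpolation (slides L12-3 to L12-5).
# The width sigma and the toy 1-D data are illustrative assumptions.
import numpy as np

def gaussian(r, sigma=0.3):
    """Gaussian basis function phi(r) = exp(-r^2 / (2 sigma^2))."""
    return np.exp(-r**2 / (2.0 * sigma**2))

def fit_exact_rbf(X, t, sigma=0.3):
    """Solve Phi w = t, where Phi[q, p] = phi(||x^q - x^p||)."""
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # N x N distances
    Phi = gaussian(dists, sigma)
    # Solving the linear system is numerically preferable to forming Phi^{-1} explicitly.
    return np.linalg.solve(Phi, t)

def interpolant(X_train, w, X_new, sigma=0.3):
    """f(x) = sum_p w_p phi(||x - x^p||), evaluated at each row of X_new."""
    dists = np.linalg.norm(X_new[:, None, :] - X_train[None, :, :], axis=-1)
    return gaussian(dists, sigma) @ w

# Toy usage: N = 5 points of a 1-D function are interpolated exactly.
X = np.linspace(0.0, 1.0, 5).reshape(-1, 1)
t = np.sin(2.0 * np.pi * X).ravel()
w = fit_exact_rbf(X, t)
print(interpolant(X, w, X))   # reproduces t up to numerical error
```

Note that, as in the slides, there is one basis function per data point and the fitted surface passes exactly through every target.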
L12-7: Properties of the Radial Basis Functions

The Gaussian and Inverse Multi-Quadric Functions are localised in the sense that

    \phi(r) \to 0 as r \to \infty

but this is not strictly necessary. All the other functions above have the property

    \phi(r) \to \infty as r \to \infty

Note that even the Linear Function \phi(r) = r = \|x - x^p\| is still non-linear in the components of x. In one dimension, this leads to a piecewise-linear interpolating function which performs the simplest form of exact interpolation.

For neural network mappings, there are good reasons for preferring localised basis functions. We shall focus our attention on Gaussian basis functions since, as well as being localised, they have a number of other useful analytic properties. We can also see intuitively how to set their widths \sigma and build up function approximations with them.

L12-8: Radial Basis Function Networks

You might think that what we have just described isn't really a neural network. And a lot of people would agree with you! However, we can see how to make it look like one:

[Network diagram: the D inputs x_i feed N hidden units, one per training pattern, whose "weights" are the training inputs x_i^p and whose activations are the basis functions \phi(\|x^p - x\|); a single output y sums these activations with weights w_p = \sum_q \Phi^{-1}_{pq} t^q.]

Note that the N training patterns \{x_i^p, t^p\} determine the weights directly. The hidden layer to output weights multiply the hidden unit activations in the conventional manner, but the input to hidden layer weights are used in a very different fashion.

L12-9: Problems with Exact Interpolation

We have seen how we can set up RBF networks that perform exact interpolation, but there are two serious problems with these exact interpolation networks:

1. They perform poorly with noisy data. As we have already seen for Multi-Layer Perceptrons (MLPs), we do not usually want the network's outputs to pass through all the data points when the data is noisy, because that will be a highly oscillatory function that will not provide good generalization.

2. They are not computationally efficient. The network requires one hidden unit (i.e. one basis function) for each training data pattern, and so for large data sets the network will become very costly to evaluate.

With MLPs we can improve generalization by using more training data; the opposite happens in RBF networks, and they take longer to compute as well.

L12-10: Improving RBF Networks

We can take the basic structure of the RBF networks that perform exact interpolation and improve upon them in a number of ways (a short code sketch of these modifications follows the list):

1. The number M of basis functions (hidden units) need not equal the number N of training data points. In general it is better to have M much less than N.

2. The centres of the basis functions do not need to be defined as the training data input vectors. They can instead be determined by a training algorithm.

3. The basis functions need not all have the same width parameter \sigma. These can also be determined by a training algorithm.

4. We can introduce bias parameters into the linear sum of activations at the output layer. These will compensate for the difference between the average value over the data set of the basis function activations and the corresponding average value of the targets.

Of course, these will make analysing and optimising the network much more difficult.
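The following is a hedged sketch (Python/NumPy, not from the lecture) of how those four modifications might look in code. The lecture defers the real training algorithms to later lectures, so the specific choices here, random selection of the M centres, a single shared width, and a linear least-squares fit of the output weights and bias, are illustrative assumptions only:

```python
# A sketch of the L12-10 modifications: M << N basis functions, centres no
# longer tied one-to-one to the training points, and a bias at the output
# layer. Centre selection by random sampling, the shared width sigma, and the
# least-squares fit are assumptions for illustration.
import numpy as np

def fit_improved_rbf(X, t, M=10, sigma=0.3, seed=0):
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    # 1. Use M basis functions, typically with M much less than N.
    # 2. Centres need not be training inputs; here they are simply sampled from them.
    centres = X[rng.choice(N, size=min(M, N), replace=False)]
    # 3. A single shared width is used here; per-centre widths are also possible.
    dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=-1)
    Phi = np.exp(-dists**2 / (2.0 * sigma**2))
    # 4. Bias: prepend a constant basis function phi_0 = 1 to the design matrix.
    Phi = np.hstack([np.ones((N, 1)), Phi])
    # Output weights (including the bias) are fitted by least squares,
    # rather than by the exact solve of the interpolation network.
    w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    return centres, w
```

With M < N the fitted function no longer passes through every data point, which is exactly the behaviour we want when the data are noisy.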
L12-11: The Improved RBF Network

When these changes are made to the exact interpolation formula, and we allow the possibility of more than one output unit, we arrive at the RBF network mapping

    y_k(x) = w_{k0} + \sum_{j=1}^{M} w_{kj} \, \phi_j(x)

which we can simplify by introducing an extra basis function \phi_0 = 1 to give

    y_k(x) = \sum_{j=0}^{M} w_{kj} \, \phi_j(x)

For the case of Gaussian basis functions we have

    \phi_j(x) = \exp(-\|x - \mu_j\|^2 / 2\sigma_j^2)

in which we have M \times D basis centres \{\mu_j\} and M widths \{\sigma_j\}. Next lecture we shall see how to determine all the network parameters \{w_{kj}, \mu_j, \sigma_j\}.

L12-12: Overview and Reading

1. We began by outlining the basic properties of RBF networks.
2. We then looked at the idea of exact interpolation using RBFs, and went through a number of common RBFs and their important properties.
3. We then saw how to set up an RBF network for exact interpolation and noted two serious problems with it.
4. We ended by formulating a more useful form of RBF network.

Reading:

1. Bishop: Sections 5.1, 5.2, 5.3, 5.4
2. Haykin: Sections 5.1, 5.2, 5.3, 5.4
3. Gurney: Section 10.4
4. Callan: Section 2.6
5. Hertz, Krogh & Palmer: Section 9.7
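As a closing illustration (not part of the original lecture), here is a minimal Python/NumPy sketch of the L12-11 mapping y_k(x) = \sum_{j=0}^{M} w_{kj} \phi_j(x) with Gaussian basis functions and the bias handled via \phi_0 = 1; the parameter shapes and example values below are assumptions chosen purely for illustration:

```python
# Forward pass of the improved RBF network of L12-11, with Gaussian basis
# functions and the bias absorbed as phi_0 = 1. Shapes and example values
# are illustrative assumptions only.
import numpy as np

def rbf_forward(x, centres, sigmas, W):
    """x: (D,) input; centres: (M, D); sigmas: (M,); W: (K, M+1) output weights."""
    r2 = np.sum((centres - x) ** 2, axis=1)        # squared distances ||x - mu_j||^2
    phi = np.exp(-r2 / (2.0 * sigmas ** 2))        # phi_j(x) for j = 1, ..., M
    phi = np.concatenate(([1.0], phi))             # prepend phi_0 = 1 (bias)
    return W @ phi                                 # y_k(x) = sum_j w_kj phi_j(x)

# Example with D = 2 inputs, M = 3 hidden units and K = 2 outputs
centres = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # the M x D centres {mu_j}
sigmas  = np.array([0.5, 0.5, 0.5])                         # the M widths {sigma_j}
W       = np.ones((2, 4))                                   # the weights {w_kj}, j = 0..M
print(rbf_forward(np.array([0.2, 0.3]), centres, sigmas, W))
```

Determining sensible values for the centres, widths and weights is exactly what the next lecture takes up.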