Slide 1

Forward Selection
Introduction to Radial Basis Function Networks
Mark J. L. Orr, April 1996

Slide 2
Subset selection

- A set of size M contains 2^M possible subsets.
- Two methods:
  - Forward selection
  - Backward elimination
- Terminal condition: some chosen criterion, such as GCV (generalised cross-validation).
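Forward selection with a GCV stopping rule can be sketched as follows (a minimal illustration, not Orr's code; the candidate design matrix `F` and the naive refit-per-candidate loop are assumptions for clarity — the orthogonal least squares slides below show how the repeated least-squares solves are avoided):

```python
import numpy as np

def gcv(sse, p, m):
    # Generalised cross-validation score for m selected basis
    # functions and p training points: GCV = p * SSE / (p - m)^2.
    return p * sse / (p - m) ** 2

def forward_selection(F, y):
    """Greedily add the candidate column of the full design matrix F
    (p x M) that most reduces the residual sum-squared-error; stop
    when GCV starts to increase (the terminal condition)."""
    p, M = F.shape
    selected = []
    best_gcv = gcv(float(y @ y), p, 0)          # empty model: SSE = y'y
    while len(selected) < min(M, p - 1):
        sse = {}
        for j in range(M):
            if j in selected:
                continue
            H = F[:, selected + [j]]
            w, *_ = np.linalg.lstsq(H, y, rcond=None)
            r = y - H @ w
            sse[j] = float(r @ r)
        j_best = min(sse, key=sse.get)
        new_gcv = gcv(sse[j_best], p, len(selected) + 1)
        if new_gcv >= best_gcv:                 # GCV rose: stop growing
            break
        selected.append(j_best)
        best_gcv = new_gcv
    return selected
```

Note there is no need to fix the number of selected basis functions in advance: the loop runs until the GCV criterion turns upward.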
Slide 3
Subset selection

- In RBF networks, subset selection searches a discrete space of subsets of a set of hidden units with fixed centres and sizes, trying to find the subset with the lowest prediction error.
- Advantages of forward selection:
  - There is no need to fix the number of hidden units in advance.
  - The model selection criteria are tractable.
  - The computational requirements are relatively low.

Slide 4
Forward Selection: Formula

- Projection matrix
- Sum-squared-error
- The training error will never increase as extra basis functions are added.
- GCV will eventually stop decreasing and start to increase as overfitting sets in.
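The formulae on this slide were images and did not survive extraction. From Orr's notes, the quantities the slide names are likely the following (H_m: design matrix of the m chosen basis functions, ŷ: training targets, p: number of examples, f_j: a candidate column):

```latex
% Projection matrix for the current subset of m basis functions
P_m = I_p - H_m \left( H_m^\top H_m \right)^{-1} H_m^\top

% Sum-squared training error
\hat{S}_m = \hat{y}^\top P_m \, \hat{y}

% Decrease in training error from adding candidate basis function f_j
\hat{S}_m - \hat{S}_{m+1} = \frac{\left( \hat{y}^\top P_m f_j \right)^2}{f_j^\top P_m f_j}

% GCV, whose eventual rise terminates the selection
\hat{\sigma}^2_{\mathrm{GCV}} = \frac{p \, \hat{y}^\top P_m^2 \, \hat{y}}{\left( \operatorname{trace} P_m \right)^2}
```

Since the decrease term is a ratio of non-negative quantities, the training error can only go down as functions are added, while the shrinking trace in the GCV denominator eventually makes GCV rise.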
Slide 5

Orthogonal Least Squares

- Each new column added to the design matrix of the growing subset is orthogonal to all previous columns.
- This simplifies the equation for the change in sum-squared-error and results in a more efficient algorithm.

Slide 6
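The orthogonalisation idea can be sketched like this (hypothetical code using classical Gram-Schmidt; Orr's notes derive the same effect with explicit update formulae):

```python
import numpy as np

def ols_forward_selection(F, y, n_select):
    """Orthogonal least squares subset selection.  Keeps an
    orthogonalised copy of every chosen column, so each new candidate
    only has to be orthogonalised against the columns already in the
    subset -- no repeated full least-squares solves are needed."""
    p, M = F.shape
    selected, Q = [], []               # Q: orthogonalised chosen columns
    r = y.astype(float)                # current residual
    for _ in range(n_select):
        best_j, best_gain, best_q = None, 0.0, None
        for j in range(M):
            if j in selected:
                continue
            q = F[:, j].astype(float)
            for u in Q:                # Gram-Schmidt against chosen columns
                q = q - (u @ q) / (u @ u) * u
            qq = q @ q
            if qq < 1e-12:             # numerically dependent column
                continue
            gain = (q @ r) ** 2 / qq   # drop in sum-squared-error
            if gain > best_gain:
                best_j, best_gain, best_q = j, gain, q
        if best_j is None:
            break
        selected.append(best_j)
        Q.append(best_q)
        r = r - (best_q @ r) / (best_q @ best_q) * best_q
    return selected
```

Because each accepted column is orthogonal to its predecessors, the error reduction of a candidate is a single inner-product ratio rather than a fresh least-squares fit.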
Orthogonal Least Squares: Formula

- Design matrix
- Sum-squared-error

Slide 7
Orthogonal Least Squares: Formula (continued)

- The new orthogonalised full design matrix
- Upper triangular matrix
- Orthogonalised optimal weight vector
- Unorthogonalised optimal weight vector

Slide 8
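The factorisation behind these labels is likely the Gram-Schmidt (thin QR-style) decomposition used in Orr's notes, with F the full design matrix:

```latex
% Orthogonalised full design matrix \tilde{F} and upper triangular A
F = \tilde{F} A

% Orthogonalised optimal weight vector
% (\tilde{F}^\top \tilde{F} is diagonal, so this is trivial to invert)
\tilde{w} = \left( \tilde{F}^\top \tilde{F} \right)^{-1} \tilde{F}^\top \hat{y}

% Unorthogonalised optimal weights recovered by back-substitution
A \, \hat{w} = \tilde{w}
```

The upper triangular system is solved by back-substitution once, after the subset has been chosen.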
Regularised Forward Selection

- Combines forward selection with standard ridge regression.
- Advantage of ridge regression: the regularisation parameter can be optimised in between the additions of new basis functions.

Slide 9
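The re-optimisation step can be illustrated with a simple grid search over the ridge parameter scored by GCV (a stand-in for the closed-form re-estimation in Orr's notes; the grid values and the function name `ridge_gcv` are inventions for this sketch):

```python
import numpy as np

def ridge_gcv(H, y, lambdas):
    """Pick the ridge parameter with the lowest GCV score by grid
    search.  GCV = p * y'P^2 y / (trace P)^2, where
    P = I - H (H'H + lam I)^{-1} H' is the regularised projection
    matrix for the current design matrix H."""
    p, m = H.shape
    best_score, best_lam = np.inf, None
    for lam in lambdas:
        A = H.T @ H + lam * np.eye(m)
        P = np.eye(p) - H @ np.linalg.solve(A, H.T)
        score = p * float(y @ P @ P @ y) / np.trace(P) ** 2
        if score < best_score:
            best_score, best_lam = score, lam
    return best_lam
```

In regularised forward selection a call like this (or its closed-form equivalent) runs between additions, so each new basis function is judged under a freshly tuned penalty.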
Regularised Forward Selection: Formula

- Projection matrix
- Sum-squared-error
- Cost function

Slide 10
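The regularised versions of these quantities, as they appear in Orr's notes, are probably (H: design matrix of the current subset of m functions, λ: ridge parameter, ŵ: weight vector):

```latex
% Projection matrix with ridge (weight-decay) parameter \lambda
P = I_p - H \left( H^\top H + \lambda I_m \right)^{-1} H^\top

% Sum-squared training error (P is no longer idempotent)
\hat{S} = \hat{y}^\top P^2 \hat{y}

% Cost function: training error plus the ridge penalty
C = \hat{y}^\top P \, \hat{y} = \hat{S} + \lambda \, \hat{w}^\top \hat{w}
```

Setting λ = 0 recovers the unregularised projection matrix and sum-squared-error of the plain forward-selection slide.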
Regularised Orthogonal Least Squares: Formula

- Sum-squared-error
- Cost function
- j-th component of the orthogonalised weight vector
- Unnormalised weight vector

Slide 11
Example

A set of p = 50 training examples sampled with Gaussian noise of standard deviation 1 from the logistic function.
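Such a training set might be generated as below; the particular logistic curve, the input range, and the seed are assumptions — the slide only fixes p = 50 and the noise standard deviation:

```python
import numpy as np

# p = 50 noisy samples of a logistic curve (reconstruction; the exact
# logistic function and input range on the slide are assumed here).
rng = np.random.default_rng(0)
p = 50
x = rng.uniform(-10.0, 10.0, size=p)     # assumed input range
f = 1.0 / (1.0 + np.exp(-x))             # standard logistic function
y = f + rng.normal(scale=1.0, size=p)    # noise std 1, as on the slide
```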
Slide 12

Example (continued)

Plots comparing plain vanilla forward selection with regularised forward selection.