Hand-written character recognition

Presentation Transcript

Slide1

Hand-written character recognition

MNIST: a data set of hand-written digits
60,000 training samples
10,000 test samples
Each sample consists of 28 x 28 = 784 pixels

Various techniques have been tried (failure rate on the test samples):
Linear classifier                       12.0%
2-layer BP net (300 hidden nodes)        4.7%
3-layer BP net (300+200 hidden nodes)    3.05%
Support vector machine (SVM)             1.4%
Convolutional net                        0.4%
6-layer BP net (7500 hidden nodes)       0.35%

Slide2
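A quick way to verify the dataset figures quoted above is to load MNIST directly; a minimal sketch, assuming TensorFlow/Keras is available:

```python
# Load MNIST and confirm the sample counts and image size from the slide.
from tensorflow.keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape)  # (60000, 28, 28): 60,000 training samples
print(x_test.shape)   # (10000, 28, 28): 10,000 test samples
print(28 * 28)        # 784 pixels per sample
```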

Hand-written character recognition

Our own experiment:
BP learning with a 784-300-10 architecture
Total # of weights: 784*300 + 300*10 = 238,200
Total # of Δw values computed per epoch: 1.4*10^10
Ran one month before it stopped
Test error rate: 5.0%

Slide3
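The slide's arithmetic checks out: one Δw per weight per training sample gives 238,200 × 60,000 ≈ 1.43 × 10^10 updates per epoch. A back-of-envelope check in Python:

```python
# Weight and update counts for a fully connected 784-300-10 network
# (weights only; biases are ignored, as on the slide).
n_in, n_hidden, n_out = 784, 300, 10
n_weights = n_in * n_hidden + n_hidden * n_out  # 238,200
n_train = 60_000                                # MNIST training samples
dw_per_epoch = n_weights * n_train              # ~1.43e10 Δw per epoch
print(n_weights, dw_per_epoch)
```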
Slide4

Risk-Averting Error Function

Mean Squared Error (MSE)
Risk-Averting Error (RAE)

Reference: James Ting-Ho Lo. Convexification for data fitting. Journal of Global Optimization, 46(2):307–315, February 2010.

Slide5
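The slide's equations were images and did not survive into this transcript. A hedged reconstruction, following Lo (2010), with K training pairs (x_k, y_k), network output f(x_k, w), and convexification parameter λ > 0:

\[
Q(w) = \frac{1}{K} \sum_{k=1}^{K} \lVert y_k - f(x_k, w) \rVert^2
\quad \text{(MSE)}, \qquad
J_\lambda(w) = \sum_{k=1}^{K} \exp\!\bigl( \lambda \lVert y_k - f(x_k, w) \rVert^2 \bigr)
\quad \text{(RAE)}.
\]

Exponentiating the squared errors penalizes large individual deviations far more heavily than MSE does; Lo (2010) shows this convexifies the error surface for sufficiently large λ.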

Normalized Risk-Averting Error (NRAE)

It can be simplified as shown below.

Slide6
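The NRAE formula on this slide was also an image. Based on Lo's papers on this criterion, it is presumably the log of the average RAE scaled by 1/λ, and the "simplified" form is presumably the numerically stable log-sum-exp rewriting; both are a reconstruction, not the slide's own text:

\[
C_\lambda(w) = \frac{1}{\lambda} \ln \Bigl( \frac{1}{K} J_\lambda(w) \Bigr)
= \varepsilon_{\max}(w) + \frac{1}{\lambda} \ln \Bigl( \frac{1}{K} \sum_{k=1}^{K} \exp\bigl( \lambda \, ( \varepsilon_k(w) - \varepsilon_{\max}(w) ) \bigr) \Bigr),
\]

where \( \varepsilon_k(w) = \lVert y_k - f(x_k, w) \rVert^2 \) and \( \varepsilon_{\max}(w) = \max_k \varepsilon_k(w) \). The second form avoids floating-point overflow when λ is large.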

The Broyden-Fletcher-Goldfarb-Shanno (BFGS) Method

A quasi-Newton method for solving nonlinear optimization problems
Uses first-order gradient information to generate an approximation to the Hessian (second-order derivative) matrix
Avoiding the calculation of the exact Hessian matrix significantly reduces the computational cost of the optimization

Slide7
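In practice BFGS is usually called through a library rather than hand-coded; a minimal usage sketch with SciPy (the Rosenbrock function here is just a stand-in test objective, not anything from the slides):

```python
# Minimize a test function with SciPy's built-in BFGS implementation.
import numpy as np
from scipy.optimize import minimize

def rosenbrock(x):
    return (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2

result = minimize(rosenbrock, x0=np.array([-1.0, 1.0]), method="BFGS")
print(result.x)  # converges near the minimizer [1, 1]
```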

The Broyden-Fletcher-Goldfarb-Shanno (BFGS) Method

The BFGS algorithm:
1. Generate an initial guess x_0 and an initial approximate inverse Hessian matrix H_0.
2. At step k, obtain a search direction p_k = −H_k ∇f(x_k), where ∇f(x_k) is the gradient of the objective function evaluated at x_k.
3. Perform a line search to find an acceptable stepsize α_k in the direction p_k, then update x_{k+1} = x_k + α_k p_k.

Slide8

The Broyden-Fletcher-Goldfarb-Shanno (BFGS) Method

4. Set s_k = α_k p_k and y_k = ∇f(x_{k+1}) − ∇f(x_k).
5. Update the approximate inverse Hessian matrix by
   H_{k+1} = (I − ρ_k s_k y_k^T) H_k (I − ρ_k y_k s_k^T) + ρ_k s_k s_k^T, where ρ_k = 1/(y_k^T s_k).
6. Repeat steps 2-5 until x_k converges to the solution. Convergence can be checked by observing the norm of the gradient, ||∇f(x_k)||.

Slide9
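Putting steps 1-6 together, a minimal NumPy sketch of the method (the line search here is a simple backtracking rule, which the slides leave unspecified; this is an illustration, not the presenters' code):

```python
# A minimal BFGS implementation following steps 1-6 above.
import numpy as np

def bfgs(f, grad_f, x0, tol=1e-6, max_iter=100):
    x = np.asarray(x0, dtype=float)
    n = x.size
    H = np.eye(n)                      # step 1: initial inverse Hessian H_0
    I = np.eye(n)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:    # step 6: check ||grad f(x_k)||
            break
        p = -H @ g                     # step 2: direction p_k = -H_k grad f(x_k)
        alpha = 1.0                    # step 3: backtracking line search
        while f(x + alpha * p) > f(x) + 1e-4 * alpha * (g @ p):
            alpha *= 0.5
        x_new = x + alpha * p          #         x_{k+1} = x_k + alpha_k p_k
        s = x_new - x                  # step 4: s_k = alpha_k p_k
        y = grad_f(x_new) - g          #         y_k = grad f(x_{k+1}) - grad f(x_k)
        rho = 1.0 / (y @ s)
        # step 5: update the inverse Hessian approximation
        H = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
            + rho * np.outer(s, s)
        x = x_new
    return x

# Usage example: minimize a quadratic whose minimizer is [1, 2].
f = lambda x: (x[0] - 1) ** 2 + 5 * (x[1] - 2) ** 2
grad = lambda x: np.array([2 * (x[0] - 1), 10 * (x[1] - 2)])
print(bfgs(f, grad, np.zeros(2)))
```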

The Broyden-Fletcher-Goldfarb-Shanno (BFGS) Method

Limited-memory BFGS (L-BFGS) method:
A variation of the BFGS method
Uses only a few vectors to represent the approximation of the Hessian matrix implicitly
Lower memory requirement
Well suited for optimization problems with a large number of variables

Slide10
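This matters for the 238,200-weight network from the earlier slide: a full inverse-Hessian approximation would be a dense 238,200 × 238,200 matrix (roughly 450 GB in double precision), while L-BFGS keeps only the last few (s_k, y_k) vector pairs. A minimal usage sketch with SciPy's implementation:

```python
# L-BFGS via SciPy; only a handful of vector pairs are stored
# instead of a full n x n inverse-Hessian approximation.
import numpy as np
from scipy.optimize import minimize

def rosenbrock(x):
    return (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2

result = minimize(rosenbrock, x0=np.array([-1.0, 1.0]), method="L-BFGS-B")
print(result.x)  # converges near [1, 1]
```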

References

J. T. Lo and D. Bassu. An adaptive method of training multilayer perceptrons. In Proceedings of the 2001 International Joint Conference on Neural Networks, volume 3, pages 2013–2018, July 2001.
J. T. Lo. Convexification for data fitting. Journal of Global Optimization, 46(2):307–315, February 2010.
BFGS: http://en.wikipedia.org/wiki/BFGS

Slide11

A Notch Function

Slide12

MSE vs. RAE

Slide13

MSE vs. RAE