Presentation Transcript

Slide 1

Gradient Checks for ANN

Yujia Bao

Mar 7, 2017

Slide 2

Finite Difference

Let $f$ be any differentiable function. We can approximate its derivative by

$$f'(x) \approx \frac{f(x+h) - f(x-h)}{2h}$$

for some very small number $h$.
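As a quick illustration (not from the original slides), here is a minimal Python sketch of the centered difference; the function name and the default step $h = 10^{-5}$ are placeholder choices:

```python
def numerical_derivative(f, x, h=1e-5):
    """Approximate f'(x) by (f(x + h) - f(x - h)) / (2 * h)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Example: for f(x) = x**2, f'(3) = 6.
print(numerical_derivative(lambda x: x ** 2, 3.0))  # ~6.0
```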

Slide 3

How to compare the numerical gradient with the analytic gradient?

Slide 4

Relative Error

Let $g_n$ be the numerical gradient calculated using finite difference, and $g_a$ be the analytic gradient calculated using back prop. Define the relative error

$$\mathrm{error} = \frac{|g_a - g_n|}{\max(|g_a|,\, |g_n|)}$$
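A small Python sketch of this comparison, assuming the max-normalized form written above; the tiny `eps` guard against division by zero is an addition, not something on the slide:

```python
def relative_error(grad_analytic, grad_numerical, eps=1e-12):
    """Relative error between analytic and numerical gradients (scalars)."""
    num = abs(grad_analytic - grad_numerical)
    den = max(abs(grad_analytic), abs(grad_numerical), eps)  # eps avoids 0/0
    return num / den

# Example: two gradients that agree to roughly 1e-7 in relative terms.
print(relative_error(0.5000001, 0.5))  # ~2e-7
```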

Slide 5

Relative Error

A relative error greater than $10^{-2}$ usually means the analytic gradient is wrong.

A relative error around $10^{-4}$ is fine for sigmoid activations (including logistic, tanh, softmax). But if you are using (leaky) ReLU, then $10^{-4}$ might be too large.

A relative error less than $10^{-7}$ means your analytic gradient is correct.

Slide 6

Debugging Procedure

Goal: Check that the gradient for a single weight $w$ is computed correctly.

Given: One example (input features with a label).

1. Forward prop and backward prop to get the analytic gradient for $w$.
2. Let $w \leftarrow w + h$ (I usually choose a very small value for $h$).
3. Forward prop to get the output, and then compute the loss.
4. Let $w \leftarrow w - 2h$ (now $w$ is the original value minus $h$).
5. Forward prop to get the output, and then compute the loss.
6. Check the relative error.
7. Recover the original weight by $w \leftarrow w + h$.
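A sketch of this procedure in Python, under the assumption that the network exposes a `loss_fn(weights)` for one forward prop and a `grad_fn(weights)` for a forward plus backward prop on the single example; those helpers, the weight container, and the default `h` are placeholders, and only the order of the steps follows the slide:

```python
def check_single_weight(weights, name, loss_fn, grad_fn, h=1e-5):
    """Gradient check for one weight on a single (input, label) example.

    loss_fn(weights) -> scalar loss from one forward prop
    grad_fn(weights) -> dict of analytic gradients from forward + backward prop
    Both callables are stand-ins for your own network code.
    """
    # 1. Forward prop and backward prop to get the analytic gradient for w.
    grad_analytic = grad_fn(weights)[name]

    # 2-3. w <- w + h, forward prop, compute the loss.
    weights[name] += h
    loss_plus = loss_fn(weights)

    # 4-5. w <- w - 2h (now w is the original minus h), forward prop, loss.
    weights[name] -= 2 * h
    loss_minus = loss_fn(weights)

    # 6. Numerical gradient from the two losses, then the relative error.
    grad_numerical = (loss_plus - loss_minus) / (2 * h)
    err = abs(grad_analytic - grad_numerical) / max(
        abs(grad_analytic), abs(grad_numerical), 1e-12)

    # 7. Recover the original weight: w <- w + h.
    weights[name] += h
    return err

# Toy example: loss(w) = w1**2 + 3*w2, so dloss/dw1 = 2*w1.
w = {"w1": 0.7, "w2": -1.2}
loss = lambda p: p["w1"] ** 2 + 3 * p["w2"]
grads = lambda p: {"w1": 2 * p["w1"], "w2": 3.0}
print(check_single_weight(w, "w1", loss, grads))  # should be tiny
```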

Slide 7

Debugging Procedure

Suppose our network has the following structure:

Input -> Conv1 -> Pool1 -> Conv2 -> Pool2 -> ReLU -> Output

If the gradients from Input to Conv1 are correct, then we are done!

Otherwise, we check the gradients from Pool1 to Conv2 (since there are no weights from Conv1 to Pool1). If those are correct, then this means there are some bugs in our back prop code from Pool1 to Input.

And so on…
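A rough sketch of this layer-by-layer search; the layer names, the threshold, and the `check_weight` callable (for example, the single-weight sketch above with `loss_fn` and `grad_fn` already bound) are assumptions for illustration:

```python
def locate_backprop_bug(weights, layer_weight_names, check_weight,
                        threshold=1e-4):
    """Run a gradient check on one weight per layer, input side first.

    layer_weight_names: weight layers in forward order, e.g. ["conv1", "conv2"].
    check_weight(weights, name) -> relative error for that weight.
    """
    for name in layer_weight_names:
        err = check_weight(weights, name)
        print(f"{name}: relative error {err:.2e}")
        if err < threshold:
            # Once a layer's gradient checks out, back prop from the output
            # down to this layer is correct, so any remaining bug must sit
            # between this layer's input and the network input.
            break
```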

Slide 8

Thanks.