Gradient Checks for ANN
Yujia Bao
Mar 7, 2017
Finite Difference
Let $f$ be any differentiable function. We can approximate its derivative by the centered difference
$$f'(x) \approx \frac{f(x + \epsilon) - f(x - \epsilon)}{2\epsilon}$$
for some very small number $\epsilon$.
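As a quick illustration, here is a minimal sketch of the formula above in Python (the helper name and the choice $\epsilon = 10^{-5}$ are illustrative, not from the slides):

```python
import numpy as np

def numerical_gradient(f, x, eps=1e-5):
    # Centered-difference approximation of f'(x).
    return (f(x + eps) - f(x - eps)) / (2 * eps)

# Sanity check on a known derivative: d/dx sin(x) = cos(x).
print(numerical_gradient(np.sin, 1.0))  # ~0.54030
print(np.cos(1.0))                      # 0.54030...
```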
How to compare the numerical gradient with the analytic gradient?
Relative Error
Let $f'_n$ be the numerical gradient calculated using finite difference, and $f'_a$ be the analytic gradient calculated using backprop. Define the relative error
$$\delta = \frac{|f'_a - f'_n|}{\max(|f'_a|,\, |f'_n|)}.$$
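In code, the definition above is essentially one line (a sketch; the guard against division by zero when both gradients vanish is our addition):

```python
def relative_error(grad_analytic, grad_numerical):
    # |f'_a - f'_n| / max(|f'_a|, |f'_n|), with 0/0 treated as a perfect match.
    denom = max(abs(grad_analytic), abs(grad_numerical))
    return abs(grad_analytic - grad_numerical) / denom if denom > 0 else 0.0
```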
Relative Error
$\delta > 10^{-2}$ usually means the analytic gradient is wrong.
$\delta < 10^{-4}$ is fine for sigmoid activations (including logistic, tanh, softmax). But if you are using (leaky) ReLU, the kink at zero can make the numerical gradient itself inaccurate, so the relative error might be larger even when the analytic gradient is correct.
$\delta < 10^{-7}$ means your analytic gradient is correct.
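Read as a decision rule, the thresholds on this slide look like the following sketch (treat them as rules of thumb, not hard limits):

```python
def interpret_rel_error(delta, uses_relu=False):
    if delta < 1e-7:
        return "analytic gradient is correct"
    if delta < 1e-4:
        return "fine for sigmoid-type activations (logistic, tanh, softmax)"
    if delta < 1e-2 and uses_relu:
        return "possibly a ReLU kink; re-check at a different input"
    return "analytic gradient is probably wrong"
```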
Debugging Procedure
Goal: Check that the gradient for a single weight $w$ is computed correctly (a code sketch follows the steps below).
Given: One example (input features with a label).
1. Forward prop and backward prop to get the analytic gradient for $w$.
2. Let $w \leftarrow w + \epsilon$ for some very small $\epsilon$ (e.g., $10^{-5}$).
3. Forward prop to get the output, and then compute the loss.
4. Let $w \leftarrow w - 2\epsilon$ (now the weight is $w - \epsilon$).
5. Forward prop to get the output, and then compute the loss.
6. Compute the numerical gradient from the two losses and check the relative error.
7. Recover the original weight by $w \leftarrow w + \epsilon$.
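A sketch of the whole procedure, using the relative_error helper from earlier. The model.forward / model.backward / loss_fn interface here is hypothetical; adapt it to your own network code:

```python
def check_single_weight(model, loss_fn, x, y, layer, idx, eps=1e-5):
    # Step 1: forward + backward prop to get the analytic gradient for w.
    out = model.forward(x)
    model.backward(loss_fn.grad(out, y))   # assumed to fill layer.dW
    analytic = model.layers[layer].dW[idx]

    W = model.layers[layer].W
    # Steps 2-3: w <- w + eps, then forward prop and compute the loss.
    W[idx] += eps
    loss_plus = loss_fn(model.forward(x), y)
    # Steps 4-5: w <- w - 2*eps (now w - eps), then forward prop and compute the loss.
    W[idx] -= 2 * eps
    loss_minus = loss_fn(model.forward(x), y)
    # Step 6: centered-difference numerical gradient and relative error.
    numerical = (loss_plus - loss_minus) / (2 * eps)
    # Step 7: recover the original weight.
    W[idx] += eps
    return relative_error(analytic, numerical)
```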
Debugging Procedure
Suppose our network has the following structure:
Input -> Conv1 -> Pool1 -> Conv2 -> Pool2 -> ReLU -> Output
If the gradients from Input to Conv1 are correct, then we are done!
Otherwise, we check the gradients from Pool1 to Conv2 (since there are no weights from Conv1 to Pool1). If those are correct, then there must be a bug in our backprop code somewhere between Pool1 and the Input.
And so on… (see the sketch below)
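One way to automate this walk over the layers, building on the hypothetical check_single_weight sketch above (layers without weights, such as the pooling layers, are skipped):

```python
def locate_backprop_bug(model, loss_fn, x, y, threshold=1e-4):
    # Check weight layers from the input side toward the output.
    weight_layers = [i for i, l in enumerate(model.layers) if hasattr(l, "W")]
    for n, i in enumerate(weight_layers):
        delta = check_single_weight(model, loss_fn, x, y, layer=i, idx=(0, 0))
        if delta < threshold:
            if n == 0:
                return "first layer's gradients are correct -- we are done"
            return f"bug in the backward pass between layer {i} and the input"
    return "no weight layer passes -- check the loss/output gradient first"
```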
Thanks.