The numerical Jacobian seems to imply that your module is an identity op, but the analytical one says the output is independent of the input (assuming the first matrix is I and the second is 0).
Are you seeing this from a custom autograd function (i.e. you wrote the backward), or from the built-in autograd operations (i.e. backward is automatically computed)?
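For illustration, here's a minimal sketch of a custom Function that would reproduce exactly this gradcheck failure: the forward is numerically the identity, but the backward (wrongly) returns zeros, so the analytical Jacobian comes out as all zeros:

```python
import torch
from torch.autograd import gradcheck

class BrokenIdentity(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        # numerically the identity, so the numerical Jacobian is I
        return x.clone()

    @staticmethod
    def backward(ctx, grad_output):
        # bug: should return grad_output; returning zeros makes the
        # analytical Jacobian all zeros
        return torch.zeros_like(grad_output)

x = torch.randn(3, dtype=torch.double, requires_grad=True)
# prints False: numerical Jacobian is I, analytical is 0
print(gradcheck(BrokenIdentity.apply, (x,), raise_exception=False))
```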
The giveaway is the matrix of zeros in the “analytical” result. All the grads are zero! I guess this means that somewhere in the computation graph, autograd lost track of the gradients, possibly by passing through a Variable I created that didn’t require gradients.
That’s how I’m interpreting that result.
So I’m checking all my Variables to make sure the ones that should have gradients do have them.
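As a sketch of the kind of thing I’m looking for (written with the current tensor API, where Variable has been merged into Tensor), a stray .data or .detach() in the middle of the computation silently cuts the graph:

```python
import torch

x = torch.randn(3, requires_grad=True)
w = torch.randn(3, requires_grad=True)

# broken: .data detaches x from the graph, so no gradient flows back to it
y_bad = (x.data * w).sum()
y_bad.backward()
print(x.grad)  # None; autograd lost track of x

# intact: gradients flow back to both x and w
y_good = (x * w).sum()
y_good.backward()
print(x.grad)  # now populated
```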
Has your problem been solved? I am implementing a custom operator according to https://pytorch.org/docs/master/notes/extending.html and also get zero analytical gradients. I am wondering whether it is a problem with the inputs or with my implementation of the backward function.
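For reference, a minimal version of the pattern from those notes (the Exp example), which should pass gradcheck when the backward is correct:

```python
import torch
from torch.autograd import gradcheck

class Exp(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        result = x.exp()
        ctx.save_for_backward(result)
        return result

    @staticmethod
    def backward(ctx, grad_output):
        result, = ctx.saved_tensors
        # d/dx exp(x) = exp(x), reusing the saved forward result
        return grad_output * result

# gradcheck needs double precision and requires_grad=True on the input
x = torch.randn(4, dtype=torch.double, requires_grad=True)
print(gradcheck(Exp.apply, (x,)))  # True
```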
Yes, that’s a bit confusing too: gradcheck tests whether requires_grad=True on each input, and if it is False, it doesn’t run the check on that input.
Normally during training, requires_grad=True on the inputs is only needed if you intend to propagate gradients outside the model, say into embeddings or something like that. That’s the confusing bit.
However, it is helpful, because sometimes there are inputs that you don’t actually want to grad-check. So in that case, you can have gradcheck ignore them by setting requires_grad=False.
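A quick sketch of that skipping behavior (scaled_add is just a hypothetical stand-in for your op):

```python
import torch
from torch.autograd import gradcheck

def scaled_add(x, scale):
    return x * scale + 1.0

x = torch.randn(3, dtype=torch.double, requires_grad=True)
# scale is deliberately excluded from the check
scale = torch.randn(3, dtype=torch.double, requires_grad=False)

# only the Jacobian w.r.t. x is verified; scale is skipped entirely
print(gradcheck(scaled_add, (x, scale)))  # True
```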