Computing the Jacobian and Hessian of the Loss w.r.t. Biases

Hi,
What is the simplest way to compute the full Jacobian and Hessian of the loss w.r.t. the neural network biases only?
I want to do this layer-wise, considering one layer at a time, and I am not worried about speed for now.

Thanks

Hi,

You can check the implementation here: https://gist.github.com/apaszke/226abdf867c4e9d6698bd198f3b45fb7, where you pass only the bias as the input.
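For reference, the core of that gist looks roughly like the following (paraphrased, so check the link for the exact code), together with a usage sketch for a single layer's bias; the layer and loss below are just placeholders:

```python
import torch

def jacobian(y, x, create_graph=False):
    # Differentiate each element of y w.r.t. x by feeding a one-hot
    # grad_outputs vector to autograd.grad, one element at a time.
    jac = []
    flat_y = y.reshape(-1)
    grad_y = torch.zeros_like(flat_y)
    for i in range(len(flat_y)):
        grad_y[i] = 1.
        grad_x, = torch.autograd.grad(flat_y, x, grad_y, retain_graph=True,
                                      create_graph=create_graph)
        jac.append(grad_x.reshape(x.shape))
        grad_y[i] = 0.  # in-place write to grad_y; with create_graph=True this
                        # can trip autograd's version checks on some versions
    return torch.stack(jac).reshape(y.shape + x.shape)

def hessian(y, x):
    # The Hessian is the Jacobian of the gradient.
    return jacobian(jacobian(y, x, create_graph=True), x)

# Usage sketch for one layer's bias:
layer = torch.nn.Linear(4, 3)
inp, target = torch.randn(2, 4), torch.randn(2, 3)
loss = torch.nn.functional.mse_loss(layer(inp), target)

grad_b = jacobian(loss, layer.bias)  # shape: (3,)
hess_b = hessian(loss, layer.bias)   # shape: (3, 3)
```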

This gives me an error:
one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor []] is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

If I do a recursive grad(grad), I don't get the full Hessian, as we all already know.
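For example (a minimal sketch with a toy scalar loss of a bias-like leaf tensor):

```python
import torch

b = torch.randn(3, requires_grad=True)   # stand-in for a bias vector
loss = (b ** 2).sum() * b.sum()          # any scalar loss of b

g, = torch.autograd.grad(loss, b, create_graph=True)  # gradient, shape (3,)

# torch.autograd.grad(g, b) raises "grad can be implicitly created only for
# scalar outputs"; passing grad_outputs=torch.ones_like(g) runs, but it only
# yields the Hessian-vector product H @ 1 (the row sums of the Hessian),
# not the full (3, 3) matrix. Each g[i] must be differentiated separately.
hvp, = torch.autograd.grad(g, b, grad_outputs=torch.ones_like(g))
```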

Does replacing this:

grad_x, = torch.autograd.grad(flat_y, x, grad_y, retain_graph=True, create_graph=create_graph)

with

grad_x, = torch.autograd.grad(flat_y[i], x, retain_graph=True, create_graph=create_graph)

solve the error?


No, now I get:
grad can be implicitly created only for scalar outputs

The y here should be a single Tensor, right?
So flat_y should be a 1D Tensor, and flat_y[i] will be a single number, so it is a scalar output.
Is the error coming from this function?
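Concretely, a sketch of the gist's loop with that change applied (per-element calls, so no grad_outputs and no in-place writes to a tensor in the graph):

```python
import torch

def jacobian(y, x, create_graph=False):
    jac = []
    flat_y = y.reshape(-1)
    for i in range(len(flat_y)):
        # flat_y[i] is a 0-dim (scalar) tensor, so grad_outputs is not
        # needed, and nothing in the graph is modified in place.
        grad_x, = torch.autograd.grad(flat_y[i], x, retain_graph=True,
                                      create_graph=create_graph)
        jac.append(grad_x.reshape(x.shape))
    return torch.stack(jac).reshape(y.shape + x.shape)
```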

Thanks, got it…
It works now.