I am trying to get the Hessian matrix of the weights in a convolutional kernel. However, there is no API that can do the job, like TensorFlow has.
TensorFlow will give you the diagonal of the Hessian, not the full Hessian (if I am not mistaken).
In the current version of PyTorch there is no way to do this, but we will have this feature in version 0.2, the next major release.
AFAIK TensorFlow will return a Hessian-vector product, like most automatic differentiation software.
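As an aside for later readers: a Hessian-vector product is easy to build from two calls to autograd once double backward is supported. A minimal sketch on current PyTorch (the helper name `hvp` is my own, not an API from this thread):

```python
import torch

def hvp(f, x, v):
    """Hessian-vector product H(x) @ v via double backward."""
    x = x.detach().requires_grad_(True)
    # First backward pass: gradient of f, with a graph attached.
    (g,) = torch.autograd.grad(f(x), x, create_graph=True)
    # Second backward pass: differentiate the scalar g . v.
    (hv,) = torch.autograd.grad(g @ v, x)
    return hv

A = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
f = lambda x: x @ A @ x   # f(x) = x^T A x, so the Hessian is A + A^T
print(hvp(f, torch.ones(2), torch.tensor([1.0, 0.0])))  # first column of A + A^T
```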
Any updates regarding second order derivatives in PyTorch?
The autograd branch, which will be merged soon, supports repeated application of .backward (or, more conveniently, a new autograd.differentiate operator) and can compute the exact Hessian-vector product.
Is the autograd branch compatible with the master branch right now? Does autograd.differentiate support taking the gradient of a higher-order function of gradients? I dug around but couldn't find the roadmap for the next release.
Has there been any update on this? That is, how to get the Hessian (even if just the diagonal) in PyTorch?
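For anyone landing on this thread with a modern PyTorch (well after the versions discussed here), there is now a built-in helper, torch.autograd.functional.hessian, that returns the full Hessian of a scalar function; the diagonal can then be read off directly. A quick sketch:

```python
import torch
from torch.autograd.functional import hessian

A = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

def f(x):
    # Scalar-valued quadratic: f(x) = x^T A x, whose Hessian is A + A^T.
    return x @ A @ x

H = hessian(f, torch.ones(2))
print(H)             # full Hessian, equals A + A.t() here
print(H.diagonal())  # just the diagonal, as asked above
```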
Would something like this work?

```python
output = model.forward(input)
hess = input.grad
```
If you're using the master branch, it's something along these lines:
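(The original snippet is not preserved in this transcript; the pattern being described, calling grad twice with create_graph=True, would look roughly like this on a modern PyTorch.)

```python
import torch
from torch.autograd import grad

x = torch.ones(2, requires_grad=True)
f = (x * x).sum()                       # simple scalar function, Hessian = 2I

# First derivative, keeping the graph so we can differentiate again.
(g,) = grad(f, x, create_graph=True)

# Second derivative: one row of the Hessian per component of the gradient.
rows = [grad(g[i], x, retain_graph=True)[0] for i in range(2)]
H = torch.stack(rows)
print(H)  # 2 * identity
```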
Thanks, will look into that!
It looks like torch.autograd does not have a grad function?
I'm using the latest PyTorch (0.1.12_2):
```
ImportError                     Traceback (most recent call last)
      1 import torch
      2 import torch.nn as nn
----> 3 from torch.autograd import Variable, grad
      4 import torchvision.transforms as transforms

ImportError: cannot import name grad
```
It's present in the master branch and not in 0.1.12.
You have to build PyTorch from source. Instructions are here: https://github.com/pytorch/pytorch#from-source
Got it, thanks!
I installed PyTorch from source and now I have access to the grad and backward functions in torch.autograd.
However, when I try to use grad, my kernel just crashes. I use it like this:
```python
output_img = model.forward(input_img)
g = grad(output_img, input_img, create_graph=True)
```
Is this not intended to be used with non-scalar (multi-dimensional) tensors? The examples on GitHub work fine.
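One likely culprit, independent of double-backward support: grad on a non-scalar output needs an explicit grad_outputs tensor. A sketch on modern PyTorch (the doubling op stands in for the model, which is my assumption, not the poster's code):

```python
import torch
from torch.autograd import grad

input_img = torch.randn(1, 3, 8, 8, requires_grad=True)
output_img = input_img * 2   # stand-in for model(input_img)

# For a non-scalar output, supply a weighting tensor of the same shape;
# this computes the vector-Jacobian product with that weighting.
(g,) = grad(output_img, input_img,
            grad_outputs=torch.ones_like(output_img),
            create_graph=True)
print(g.shape)  # same shape as input_img
```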
AFAIK not all of the operations have been modified to be twice-differentiable yet. I saw a couple of open PRs.
I suspect that is why it's crashing.
I see, that makes sense, given that the model is a CNN with skip connections. Thanks!
Are there any updates on this? It's been some time.
What's wrong with just doing w.grad.backward()?
From PyTorch 0.2.0 you can compute higher-order gradients. More information can be found here: https://github.com/pytorch/pytorch/releases/tag/v0.2.0
```python
import torch
from torch.autograd import Variable, grad

torch.manual_seed(623)

x = Variable(torch.ones(2, 1), requires_grad=True)
A = torch.FloatTensor([[1, 2], [3, 4]])
print(A)
print(x)

f = x.view(-1) @ A @ x           # f(x) = x^T A x
print(f)

# First derivative: df/dx = (A + A^T) x
x_1grad, = grad(f, x, create_graph=True)
print(x_1grad)
print(A @ x + A.t() @ x)         # the same thing, computed by hand

# Second derivative, one component of the gradient at a time;
# each call yields one column of the Hessian.
x_2grad0, = grad(x_1grad[0], x, create_graph=True)
x_2grad1, = grad(x_1grad[1], x, create_graph=True)

Hessian = torch.cat((x_2grad0, x_2grad1), dim=1)
print(Hessian)
print(A + A.t())                 # the Hessian of x^T A x
```
I've worked on this issue for some time; please feel free to use the code above.