I am trying to get the Hessian matrix of the weights in a convolutional kernel. However, PyTorch does not seem to have an API for this the way TensorFlow does.
TensorFlow will give you the diagonal of the Hessian, not the full Hessian (if I am not confused).
In the current version of PyTorch there is no way to do this, but we will have this feature in version 0.2, the next major release.
AFAIK TensorFlow will return you a Hessian-vector product like most automatic differentiation software.
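For readers who want to see what a Hessian-vector product looks like in practice, here is a minimal sketch. It assumes a recent PyTorch release (not the 0.1.x versions discussed in this thread), and the helper name hessian_vector_product and the test function sum(x^3) are illustrative choices, not anything from the posts above.

```python
import torch

def hessian_vector_product(f, x, v):
    """Return H(x) @ v for a scalar-valued f without building the full Hessian H.

    Hypothetical helper for illustration only.
    """
    x = x.detach().requires_grad_(True)
    # First pass: keep the graph so the gradient is itself differentiable.
    (g,) = torch.autograd.grad(f(x), x, create_graph=True)
    # Second pass: differentiate the scalar <g, v> with respect to x.
    (hv,) = torch.autograd.grad((g * v).sum(), x)
    return hv

# Illustrative function f(x) = sum(x^3), whose Hessian is diag(6 * x).
x = torch.tensor([1.0, 2.0, 3.0])
v = torch.tensor([1.0, 0.0, -1.0])
hv = hessian_vector_product(lambda t: (t ** 3).sum(), x, v)
print(hv)  # should match 6 * x * v
```

The point of the two-pass form is that the cost stays close to a couple of backward passes, whereas materializing the full Hessian needs one pass per parameter.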
Any updates regarding second order derivatives in PyTorch?
The autograd branch, which will be merged soon, supports repeated application of .backward (or, more conveniently, a new autograd.differentiate operator) and can compute the exact Hessian-vector product.
Is autograd compatible with the master branch right now? Does autograd.differentiate support taking the gradient of a high-order function of the gradient? I dug around but couldn’t find the roadmap for the next release.
Has there been any update on this? That is, how do I get the Hessian (even if just the diagonal) in PyTorch?
Would something like this work:
output = model.forward(input)
model.backward().backward()
hess = input.grad
If you’re using the master branch, it’s something along these lines: (https://github.com/pytorch/pytorch/pull/1016#issuecomment-299919437)
Thanks, will look into that!
It looks like torch.autograd does not have a “grad” function?
I’m using the latest PyTorch (0.1.12_2):
ImportError Traceback (most recent call last)
in ()
1 import torch
2 import torch.nn as nn
----> 3 from torch.autograd import Variable, grad
4 import torchvision.transforms as transforms
5
ImportError: cannot import name grad
It’s present in the master branch and not in 0.1.12.
You have to build PyTorch from source. Instructions are here: https://github.com/pytorch/pytorch#from-source
Got it, thanks!
I installed PyTorch from source and now I have access to the grad and backward functions in torch.autograd.
However, when I try to use grad, my kernel just crashes. I use it like this:
output_img = model.forward(input_img)
g = grad(output_img, input_img, create_graph=True)
input_img: 1x3x256x256
output_img: 1x3x256x256
Is this not intended to be used with non-scalar (multi-dimensional) Tensors? The examples on GitHub work fine.
Thanks!
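On the non-scalar question specifically: in recent PyTorch releases, grad only creates the starting gradient implicitly for scalar outputs, so an image-shaped output needs an explicit grad_outputs tensor (or a reduction to a scalar first). The sketch below shows both forms. The single Conv2d layer and the 8x8 image are hypothetical stand-ins for the real model and input, and this is not a diagnosis of the crash described above.

```python
import torch

# Hypothetical stand-in for the CNN: one 3x3 convolution that preserves image size.
model = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1)
input_img = torch.randn(1, 3, 8, 8, requires_grad=True)  # small image to keep it cheap
output_img = model(input_img)

# Non-scalar output: pass grad_outputs, the vector in the vector-Jacobian product.
weights = torch.ones_like(output_img)
(g,) = torch.autograd.grad(
    output_img, input_img, grad_outputs=weights, create_graph=True
)
print(g.shape)

# Equivalent formulation: reduce the output to a scalar before differentiating.
(g_sum,) = torch.autograd.grad(model(input_img).sum(), input_img)
```

Either way the result has the shape of input_img; it is a weighted sum of output gradients, not a full Jacobian.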
AFAIK not all operations have been modified to be twice differentiable yet. I saw a couple of open PRs.
I suspect that is why it’s crashing.
I see, that makes sense, given that the model is a CNN with skip connections. Thanks!
Are there any updates on this? It’s been some time.
What’s wrong with just doing w.grad.backward()?
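One way to look at the w.grad.backward() idea, sketched under the assumption of a recent PyTorch release: after a default backward call, w.grad carries no graph, so there is nothing to differentiate a second time. Requesting the graph with create_graph=True makes the gradient differentiable. The tensor w and the function sum(w^3) are illustrative only.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

# Default backward: w.grad is a plain tensor with no graph behind it.
(w ** 3).sum().backward()
plain_grad_has_graph = w.grad.requires_grad

# With create_graph=True the gradient stays attached to the graph,
# so it can be differentiated again.
(g,) = torch.autograd.grad((w ** 3).sum(), w, create_graph=True)
(second,) = torch.autograd.grad(g.sum(), w)
print(plain_grad_has_graph, g.requires_grad, second)
```

Here second equals 6 * w because the function is separable; for a general function, differentiating g.sum() yields the Hessian's row sums rather than its diagonal.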
From PyTorch 0.2.0, you can get higher-order gradients. More information can be found here: https://github.com/pytorch/pytorch/releases/tag/v0.2.0
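import torch
from torch.autograd import Variable, grad
torch.manual_seed(623)
# x is a 2x1 column vector; A is a fixed, non-symmetric 2x2 matrix
x = Variable(torch.ones(2,1), requires_grad=True)
A = torch.FloatTensor([[1,2],[3,4]])
print(A)
print(x)
# Quadratic form f(x) = x^T A x
f = x.view(-1) @ A @ x
print(f)
# First derivative; create_graph=True keeps the graph so it can be differentiated again
x_1grad, = grad(f, x, create_graph=True)
print(x_1grad)
print(A @ x + A.t() @ x)  # analytic gradient (A + A^T) x, for comparison
# Second derivative: differentiating each gradient entry gives one column of the Hessian
x_2grad0, = grad(x_1grad[0], x, create_graph=True)
x_2grad1, = grad(x_1grad[1], x, create_graph=True)
Hessian = torch.cat((x_2grad0, x_2grad1), dim=1)
print(Hessian)
print(A + A.t())  # analytic Hessian A + A^T, for comparison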
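The two per-entry grad calls above generalize to a loop over every entry of the gradient. The following is a rough sketch of that idea, assuming a recent PyTorch release; full_hessian is a hypothetical helper name, and the cross-check uses torch.autograd.functional.hessian, which exists only in newer versions than the ones discussed in this thread.

```python
import torch

def full_hessian(f, x):
    """Build the dense Hessian of a scalar-valued f at x, one row per gradient entry.

    Hypothetical helper for illustration; cost grows with the number of entries in x.
    """
    x = x.detach().requires_grad_(True)
    (g,) = torch.autograd.grad(f(x), x, create_graph=True)
    g_flat = g.reshape(-1)
    rows = []
    for i in range(g_flat.numel()):
        # retain_graph=True because the same graph is reused for every row.
        (row,) = torch.autograd.grad(g_flat[i], x, retain_graph=True)
        rows.append(row.reshape(-1))
    return torch.stack(rows)

# Same quadratic form as above: f(x) = x^T A x, with Hessian A + A^T.
A = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
quad = lambda x: x @ A @ x
x0 = torch.ones(2)

H = full_hessian(quad, x0)
H_builtin = torch.autograd.functional.hessian(quad, x0)
print(H)
print(H_builtin)
```

For a convolutional kernel with many weights this dense approach gets expensive quickly, which is why Hessian-vector products are usually preferred when only products with the Hessian are needed.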
I’ve worked on this issue for some time; please feel free to use it below.