Issues computing Hessian-vector product

I think the issue can best be described with a simple example. In the following script, I’m trying to compute the Hessian-vector product, where the Hessian is of f_of_theta taken with respect to theta and the vector is simply vector.

import torch
from torch.autograd import Variable, grad

theta = Variable(torch.randn(2,2), requires_grad=True)
f_of_theta = torch.sum(theta ** 2 + theta)
vector = Variable(torch.randn(2,2))

gradient = grad(f_of_theta, theta)[0]
gradient_vector_product = torch.sum(gradient * vector)
gradient_vector_product.requires_grad = True
hessian_vector_product = grad(gradient_vector_product, theta)[0]

gradient is calculated correctly, but when the script tries to calculate hessian_vector_product, I get the following error:

terminate called after throwing an instance of 'std::runtime_error'
what(): differentiated input is unreachable
Aborted

So, simply put, my question is how exactly should I do what I’m trying to do? Any help with this would be greatly appreciated.

Edit: Note that I’m using the pytorch version built from the latest commit on master (ff0ff33).

There’s a Hessian-vector product example in the autograd tests: https://github.com/pytorch/pytorch/blob/master/test/test_autograd.py#L151


Yeah, I was looking at that before but I couldn’t get it to work by doing something that I thought was analogous. Turns out my hang-up was that it didn’t occur to me that it was important to pass in a Variable to the first backward pass and not a Tensor.

For the sake of anyone else who may read this in the future, it appears that the following is what is needed to get my simple example working:

import torch
from torch.autograd import Variable, grad

theta = Variable(torch.randn(2,2), requires_grad=True)
f_of_theta = torch.sum(theta ** 2 + theta)
vector = Variable(torch.randn(2,2))

f_of_theta.backward(Variable(torch.ones(2,2), requires_grad=True), retain_variables=True)
gradient = theta.grad
gradient_vector_product = torch.sum(gradient * vector)
gradient_vector_product.backward(torch.ones(2,2))
hessian_vector_product = theta.grad - gradient


You don’t need to pass in a Variable nor specify retain_variables. This would be enough:

f_of_theta.backward(create_graph=True)
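
For later readers, here is a minimal sketch of the whole Hessian-vector product done this way, assuming a build where torch.autograd.grad accepts create_graph (the names mirror the example above):

import torch
from torch.autograd import Variable, grad

theta = Variable(torch.randn(2, 2), requires_grad=True)
vector = Variable(torch.randn(2, 2))
f_of_theta = torch.sum(theta ** 2 + theta)

# create_graph=True builds a graph for the gradient itself, so it can be differentiated again
gradient = grad(f_of_theta, theta, create_graph=True)[0]
gradient_vector_product = torch.sum(gradient * vector)

# differentiating the gradient-vector product w.r.t. theta gives the Hessian-vector product
hessian_vector_product = grad(gradient_vector_product, theta)[0]

# sanity check: for this f the Hessian is 2 * I, so the result should equal 2 * vector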

When I do:

import torch
from torch.autograd import Variable, grad

theta = Variable(torch.randn(2,2), requires_grad=True)
f_of_theta = torch.sum(theta ** 2 + theta)
vector = Variable(torch.randn(2,2))

f_of_theta.backward(create_graph=True)
gradient = theta.grad
gradient_vector_product = torch.sum(gradient * vector)
gradient_vector_product.backward(torch.ones(2,2))
hessian_vector_product = theta.grad - gradient

I get:

TypeError: backward() got an unexpected keyword argument 'create_graph'

I’m still on the same PyTorch version as before. Weird. I definitely see the create_graph argument on line 46 of ./torch/autograd/__init__.py in the source I’m building off of. Not sure what to make of that.


Hard to say. Maybe try to pip uninstall torch and build it again?


Hello @apaszke,

I’m not sure whether it is relevant, but for me fn.backward does not take create_graph either, while backward(fn, create_graph=True) works as expected.
This seems to be because, right now, Variable.backward does not pass create_graph through.

Best regards

Thomas
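
A small sketch of that workaround, assuming a build where the module-level torch.autograd.backward already forwards create_graph while the Variable method does not:

import torch
from torch.autograd import Variable

theta = Variable(torch.randn(2, 2), requires_grad=True)
f_of_theta = torch.sum(theta ** 2 + theta)

# Variable.backward(create_graph=True) raises a TypeError on such builds,
# but calling the functional form directly works:
torch.autograd.backward(f_of_theta, create_graph=True)

# theta.grad now carries its own graph and can be differentiated a second time
gradient = theta.grad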

@tom You’re right! I forgot to add new arguments to Variable.backward. Thanks.

@apaszke That was probably my issue too. On a somewhat related note, is there any sense of an ETA on converting all operations to be twice differentiable?


I ran example.py from https://github.com/mjacar/pytorch-trpo and came across several problems; one is the unexpected argument create_graph.

The following changes made the TRPO CartPole example work in Python 3:

  • updating the dict iteration methods,
  • replacing xrange with range, and
  • removing create_graph=True.

But I’d like to know: what does create_graph do for us? And has this argument already been removed? Thanks.


Hi @tigerneil,

Try uninstalling torch and rebuilding from source. Then it should work fine with create_graph=True.


create_graph=True is necessary for the proper second-derivative approach to work. Right now the bottleneck on that is making all PyTorch operations twice differentiable, so, in a sense, that parameter is presently useless. For what it’s worth, I plan on continuing work on that repo, which will include documentation so there’s no confusion over Python 2 vs. Python 3, among numerous other things.

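To illustrate what create_graph actually does, here is a rough sketch (not taken from the repo): without it the computed gradient has no history of its own, so a second derivative can’t be taken through it.

import torch
from torch.autograd import Variable, grad

theta = Variable(torch.randn(2, 2), requires_grad=True)

# without create_graph: the returned gradient is cut off from the graph,
# so trying to differentiate it again would fail
f = torch.sum(theta ** 2 + theta)
plain_gradient = grad(f, theta)[0]

# with create_graph=True: the gradient is itself part of the graph,
# which is what second-derivative code such as Hessian-vector products needs
f = torch.sum(theta ** 2 + theta)
gradient = grad(f, theta, create_graph=True)[0]
second_derivative = grad(torch.sum(gradient), theta)[0]  # all twos here, since the Hessian is 2 * I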

Yeah, I rebuilt from source. It works now. Thanks.


And only 0.2.0 supports that, but not 0.1.2?

Yes. As far as I understand, 0.2 will be released next week or so, but until then you need to compile from source for second derivatives.