# Cannot compute Hessian Vector Product of `nn.Module`

Hi, I’m trying to compute a Hessian-vector product of a network (for TRPO), but the following code doesn’t work as expected. Am I missing something important? Does anybody know how to solve it? Thank you in advance.

## Environment

• Mac and Ubuntu
• Python 3.6
• PyTorch 0.4.0

## Conv case

``````
import torch
from torch import nn, autograd

conv = nn.Conv2d(3, 64, 3)
input = torch.randn(1, 3, 32, 32)
out = conv(input).sum()
# First-order gradients; create_graph=True so they can be differentiated again
grads = autograd.grad(out, conv.parameters(), create_graph=True)
flatten = torch.cat([g.reshape(-1) for g in grads if g is not None])
x = torch.randn_like(flatten)
print(flatten.shape) ## torch.Size([1792])
hvps = autograd.grad([flatten @ x], conv.parameters(), allow_unused=True)
print(hvps[1]) ## None
flatten2 = torch.cat([g.reshape(-1) for g in hvps if g is not None])
print(flatten2.shape) ## torch.Size([1728])
``````

In this case, the Hessian-vector product entry for `conv.bias` is `None`, so `flatten2` is missing the 64 bias entries (1728 elements instead of 1792).

## Linear case

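``````
linear = nn.Linear(10, 20)
input = torch.randn(1, 10)
out = linear(input).sum()
# First-order gradients; create_graph=True so they can be differentiated again
grads = autograd.grad(out, linear.parameters(), create_graph=True)
flatten = torch.cat([g.reshape(-1) for g in grads if g is not None])
x = torch.randn_like(flatten)
print(flatten.shape)
hvps = autograd.grad([flatten @ x], linear.parameters(), allow_unused=True)
``````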

Here, I get the following error.

``````
Traceback (most recent call last):
  File "fvp.py", line 24, in <module>
    hvps = autograd.grad([flatten @ x], linear.parameters(), allow_unused=True)
    inputs, allow_unused)
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
``````

We found that adding a ReLU after the layer solves this issue. ReLU is all we need.

``````
conv = nn.Sequential(nn.Conv2d(3, 64, 3),
                     nn.ReLU())
``````
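A possible explanation, offered as a sketch rather than something stated in the thread: `linear(input).sum()` is linear in the weight and bias, so its gradient does not depend on them and there is nothing left to differentiate a second time. A nonlinearity makes the gradient depend on the parameters again. The helper name `grad_is_differentiable` below is hypothetical; it only reports whether the flattened first-order gradient is still attached to the autograd graph.

```python
import torch
from torch import nn, autograd

torch.manual_seed(0)


def grad_is_differentiable(model, inputs):
    """Return True if d(out)/d(params) is still part of the autograd graph."""
    out = model(inputs).sum()
    # create_graph=True records the backward pass so it can be differentiated
    grads = autograd.grad(out, list(model.parameters()), create_graph=True)
    flat = torch.cat([g.reshape(-1) for g in grads])
    return flat.requires_grad


inputs = torch.randn(1, 10)
plain = nn.Linear(10, 20)
with_relu = nn.Sequential(nn.Linear(10, 20), nn.ReLU())

linear_ok = grad_is_differentiable(plain, inputs)
relu_ok = grad_is_differentiable(with_relu, inputs)
print("plain linear:", linear_ok, "| with ReLU:", relu_ok)
```

For the plain layer I would expect `False`, which corresponds to the `RuntimeError` in the question, though the exact behaviour may vary between PyTorch versions. Also note that ReLU is piecewise linear, so for a single layer followed by ReLU the graph is connected but the Hessian itself is zero almost everywhere.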

Hi @moskomule,

I keep getting `(None, None)` as `hvps`. What is `autograd.grad()` supposed to return? Here is the (slightly modified) code:

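``````
import torch
from torch import nn, autograd

linear = nn.Linear(10, 20)
x = torch.randn(1, 10)
out = linear(x).sum()
grads = autograd.grad(out, linear.parameters(), create_graph=True)
flatten = torch.cat([g.reshape(-1) for g in grads if g is not None])
x = torch.randn_like(flatten)

p = (flatten @ x)
hvps = autograd.grad(p, linear.parameters(), allow_unused=True)
print(hvps)
``````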

Output:

``````
torch.Size([220]) False
(None, None)
``````

Without `allow_unused=True`, I get the error `One of the differentiated Tensors appears to not have been used in the graph`. Any idea what this means? Which graph is it referring to?

I also don’t understand the solution: why does adding ReLU help here? Can you help?

Thanks,

As you can see in https://pytorch.org/docs/master/autograd.html, `grad` has the following signature:

`torch.autograd.grad(outputs, inputs, grad_outputs=None, retain_graph=None, create_graph=False, only_inputs=True, allow_unused=False)`

`outputs` and `inputs` are expected to be sequences of tensors, but you pass a single tensor as `outputs`.
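For reference, here is a minimal sketch of a Hessian-vector product that passes lists for both `outputs` and `inputs`. The helper name `hessian_vector_product`, the small `Tanh` network, and the squared-output loss are my own assumptions for illustration, not code from this thread; `Tanh` is used so that the Hessian is not identically zero.

```python
import torch
from torch import nn, autograd


def hessian_vector_product(loss, params, vector):
    """Return H @ vector, where H is the Hessian of `loss` w.r.t. `params`.

    `params` is a sequence of tensors and `vector` is a flat tensor with one
    entry per parameter element. Hypothetical helper, not a PyTorch API.
    """
    params = list(params)
    # First backward pass; create_graph=True keeps the graph of the gradient.
    grads = autograd.grad([loss], params, create_graph=True)
    flat_grad = torch.cat([g.reshape(-1) for g in grads])
    # Second backward pass through the scalar (gradient . vector).
    hvps = autograd.grad([flat_grad @ vector], params, allow_unused=True)
    # Parameters that the gradient does not depend on get a zero block.
    hvps = [torch.zeros_like(p) if h is None else h
            for p, h in zip(params, hvps)]
    return torch.cat([h.reshape(-1) for h in hvps])


torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 3), nn.Tanh(), nn.Linear(3, 1))
inputs = torch.randn(8, 4)
loss = model(inputs).pow(2).mean()
params = list(model.parameters())
vector = torch.randn(sum(p.numel() for p in params))

hvp = hessian_vector_product(loss, params, vector)
print(hvp.shape)
```

If the gradient does not depend on the parameters at all (for example a single `nn.Linear` followed by `.sum()`), the second `autograd.grad` call in this sketch can still raise the `RuntimeError` shown in the original question.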

Thanks @moskomule for the help! I solved the issue.
