Hi everyone,

First question here, I hope it’s the correct place.

I have a chain of nn.Sequential modules and I’m trying to extract two gradients from my model: the gradient of a module’s output w.r.t. that module’s input, and the gradient of the module’s output w.r.t. x (the initial input).

In other words, given:

```
y1 = f1(x)
y2 = f2(y1)
# and so on
```

where f_i is an nn.Sequential (or any nn.Module), I’d like to extract dy2/dy1 (module output wrt module input) and dy2/dx (module output wrt x).

Thanks!

Alex

Hi Alex,

You might be looking for torch.autograd.grad :

https://pytorch.org/docs/stable/generated/torch.autograd.grad.html

Feel free to post in the thread if you face any errors.
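For reference, here is a minimal sketch of the scalar-output case (the variable names are just for illustration). `torch.autograd.grad` returns a tuple with one gradient per input tensor:

```python
import torch

x = torch.randn(3, requires_grad=True)
y = (x ** 2).sum()  # scalar output

# returns a tuple: one gradient tensor per input passed in
(dy_dx,) = torch.autograd.grad(y, x)
assert torch.allclose(dy_dx, 2 * x)  # d/dx sum(x^2) = 2x
```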

Hi Srishi,

Thanks for your help. After going over that doc page, I came up with this tiny piece of code to test it out:

```
model = nn.Sequential(
    nn.Sequential(
        nn.Linear(12, 14),
        nn.Linear(14, 16),
    ),
    nn.Sequential(
        nn.Linear(16, 16),
        nn.Linear(16, 20),
    ),
    nn.Sequential(
        nn.Linear(20, 20),
        nn.Linear(20, 20),
    ),
).to(device)

grads = []

def hook2(module, inputs, outputs):
    # import pdb
    # pdb.set_trace()
    grads.append(torch.autograd.grad(outputs, inputs[0]))

grads.clear()
for module in model.modules():
    if isinstance(module, nn.Sequential):
        module.register_forward_hook(hook2)
```

However, I’m getting an error:

RuntimeError: grad can be implicitly created only for scalar outputs

Not sure how to proceed from here.

Hi @alexandrumeterez ,

Yes, that’s expected behaviour: autograd can only implicitly create the gradient for a scalar output. To backpropagate from a multi-dimensional tensor, pass an explicit gradient of the same shape:

```
import torch
inp = torch.tensor([4.0, 3.0, 2.0], requires_grad=True)
x = (inp*2)**3 # x is not a scalar
x.backward(torch.ones_like(x))
print(inp.grad) # tensor([384., 216., 96.])
```

In fact, when `x` is a scalar, `x.backward()` is a shortcut for `x.backward(torch.Tensor([1]))`.
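The same idea carries over to the hooks above: pass `grad_outputs=torch.ones_like(outputs)` to `torch.autograd.grad`. A rough sketch of one way to do it (`retain_graph=True` keeps each per-module graph alive so later gradient calls still work, and iterating over `model`’s direct children avoids also hooking the outer `Sequential`, which `model.modules()` would return as well):

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Sequential(nn.Linear(12, 14), nn.Linear(14, 16)),
    nn.Sequential(nn.Linear(16, 20)),
)

grads = []

def hook(module, inputs, outputs):
    # grad_outputs plays the role of x.backward(torch.ones_like(x)):
    # a vector-Jacobian product instead of an implicit scalar gradient
    (g,) = torch.autograd.grad(
        outputs,
        inputs[0],
        grad_outputs=torch.ones_like(outputs),
        retain_graph=True,  # don't free the graph; we still need it later
    )
    grads.append(g)

# hook only the direct children, not the root Sequential itself
for module in model:
    module.register_forward_hook(hook)

x = torch.randn(2, 12, requires_grad=True)
y = model(x)

# grads[0]: d(module 0 output)/dx,                 shape (2, 12)
# grads[1]: d(module 1 output)/d(module 0 output), shape (2, 16)

# gradient of the final output w.r.t. the initial input x
(dy_dx,) = torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y))
```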

Hope this helps,

S


This is what I was looking for, thank you!