First question here, I hope it’s the correct place.
I have a chain of nn.Sequential modules and I’m trying to extract two gradients from my model: the gradient of a module’s output with respect to that module’s input, and the gradient of that output with respect to x (the initial input).
In other words, given:
y1 = f1(x)
y2 = f2(y1)
# and so on
where f_i is an nn.Sequential (or any nn.Module), I’d like to extract dy2/dy1 (module output wrt module input) and dy2/dx (module output wrt x).
You might be looking for torch.autograd.grad:
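For example, something along these lines should give you both gradients in your setup (a minimal sketch; the f1/f2 definitions and the shapes here are just placeholders for your actual modules):

import torch
import torch.nn as nn

f1 = nn.Sequential(nn.Linear(4, 8), nn.Tanh())  # placeholder for your first block
f2 = nn.Sequential(nn.Linear(8, 3))             # placeholder for your second block

x = torch.randn(1, 4, requires_grad=True)
y1 = f1(x)
y2 = f2(y1)

# Gradients of y2 wrt y1 and wrt x. Since y2 is not a scalar, grad_outputs
# supplies the vector for the vector-Jacobian product (here all ones).
dy2_dy1, dy2_dx = torch.autograd.grad(
    outputs=y2,
    inputs=(y1, x),
    grad_outputs=torch.ones_like(y2),
    retain_graph=True,  # keep the graph in case you want to call grad again
)
print(dy2_dy1.shape, dy2_dx.shape)  # torch.Size([1, 8]) torch.Size([1, 4])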
Feel free to post in the thread if you face any errors.
Thanks for your help. After going over that doc page, I came up with this tiny piece of code to test it out:
import torch, torch.nn as nn

model = nn.Sequential(nn.Sequential(nn.Linear(10, 10), nn.ReLU()),
                      nn.Sequential(nn.Linear(10, 1)))
grads = {}

def hook2(module, inputs, outputs):
    grads[module] = torch.autograd.grad(outputs, inputs)

for module in model.modules():
    if isinstance(module, nn.Sequential):
        module.register_forward_hook(hook2)
model(torch.randn(1, 10, requires_grad=True))  # forward pass triggers the hooks
However, I’m getting an error:
RuntimeError: grad can be implicitly created only for scalar outputs
Not sure how to proceed from here.
Hi @alexandrumeterez ,
Yes, that’s expected behaviour when you try to use autograd to compute the gradient of a tensor that isn’t a single number (a scalar).
If you want to compute the gradient when the output is a multi-dimensional tensor, pass an explicit gradient argument to backward():
inp = torch.tensor([4.0, 3.0, 2.0], requires_grad=True)
x = (inp*2)**3                  # x is not a scalar
x.backward(torch.ones_like(x))  # gradient argument is required for non-scalar outputs
print(inp.grad)                 # tensor([384., 216., 96.])
In fact, when x is a scalar, x.backward() is a shortcut for x.backward(torch.tensor(1.)).
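For example, for a scalar output both of these produce the same gradient:

inp = torch.tensor(2.0, requires_grad=True)
y = inp ** 2                   # y is a scalar
y.backward(torch.tensor(1.0))  # explicit gradient argument
print(inp.grad)                # tensor(4.)

inp.grad = None                # reset the accumulated gradient
y = inp ** 2
y.backward()                   # implicit gradient of 1 for scalar outputs
print(inp.grad)                # tensor(4.)

The same idea applies to torch.autograd.grad in your hook: passing grad_outputs=torch.ones_like(outputs) lifts the scalar-only restriction.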
Hope this helps,
This is what I was looking for, thank you!