I want to differentiate the output of a model with respect to its input. Here is the code (self is a class similar to nn.Module, and x is the input, a batch of images):
lsm = torch.nn.LogSoftmax(dim=1)
x.requires_grad = True      # track gradients with respect to the input
self.eval()
logits = self.evaluate(x)   # forward pass
out_lsm = lsm(logits)       # per-class log-probabilities
aux_sum = torch.sum(torch.max(out_lsm, dim=1)[0])  # sum of per-sample max log-probs
aux_sum.backward()
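For context, here is a self-contained sketch of the pattern I'm describing (ToyNet is just a placeholder for illustration, not my actual network, and I'm assuming evaluate is a plain forward pass with no torch.no_grad() inside):

import torch
import torch.nn as nn

# placeholder model standing in for the real one
class ToyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(3 * 8 * 8, 10)

    def evaluate(self, x):
        # assumed to be an ordinary forward pass
        return self.fc(x.flatten(1))

model = ToyNet()
model.eval()

x = torch.randn(4, 3, 8, 8)  # batch of 4 "images"
x.requires_grad = True

lsm = nn.LogSoftmax(dim=1)
logits = model.evaluate(x)
out_lsm = lsm(logits)
aux_sum = torch.sum(torch.max(out_lsm, dim=1)[0])
aux_sum.backward()

print(x.grad.shape)          # torch.Size([4, 3, 8, 8])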
I want to find the gradient of aux_sum with respect to x, but for some reason aux_sum has no grad_fn, and backward() fails with: RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn.
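For what it's worth, this is the kind of check I'd use to narrow down where the autograd graph breaks (same variable names as in my snippet above):

# diagnostic: find the first tensor that detached from the graph
print(x.requires_grad)       # should be True
print(logits.requires_grad)  # False here would point at evaluate()
print(logits.grad_fn)        # None means no graph was recorded
print(out_lsm.grad_fn)       # should be a LogSoftmaxBackward node
print(aux_sum.grad_fn)       # None reproduces the backward() error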
As a sanity check, I can do the same process but with a simpler “model”:
x = torch.tensor([0., 1., 2.], requires_grad=True)  # leaf tensor with grad tracking
out_lsm = x ** 2
aux_sum = out_lsm.sum()
aux_sum.backward()
print(x.grad)
>>> tensor([0., 2., 4.])
This runs with no issues, and the printed gradient matches the expected d/dx of sum(x_i**2) = 2*x_i. Does anyone know what is causing the gradient issue in the first case, and why it's not the same as the simplified second case?
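For completeness, once backward() goes through, this is how I plan to read off the input gradient (torch.autograd.grad shown as an alternative that returns it directly; aux_sum and x as in the first snippet):

# after a successful backward(), the gradient sits in x.grad
grad_x = x.grad  # same shape as x, one gradient entry per pixel

# alternative that returns the gradient as a value instead of
# populating x.grad (called instead of backward(), not after it,
# since the graph is freed by backward()):
# grad_x, = torch.autograd.grad(aux_sum, x)

Thank you!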