Differentiating with respect to input: tensor does not require grad and does not have a grad_fn, yet requires_grad is set to True

I want to differentiate the output of a model with respect to its input. Here is the code (self is an instance of a class similar to nn.Module, and x is the input, a batch of images):

lsm = torch.nn.LogSoftmax(dim=1)
x.requires_grad = True

logits = self.evaluate(x)
out_lsm = lsm(logits)

aux_sum = torch.sum(torch.max(out_lsm, axis=1)[0])

I want to find the gradient of aux_sum with respect to x, but for some reason aux_sum has no gradient function, and I get the error “element 0 of tensors does not require grad and does not have a grad_fn”.

As a sanity check, I can do the same process but with a simpler “model”:

x = torch.Tensor([0, 1, 2])
x.requires_grad = True

out_lsm = x ** 2

aux_sum = out_lsm.sum()
aux_sum.backward()  # populates x.grad with d(aux_sum)/dx = 2 * x
x.grad
>>> tensor([0., 2., 4.])

with no issues. Does anyone know what is causing the gradient issue in the first case, and why it’s not the same as the simplified second case? Thank you!

Check whether out_lsm.requires_grad is True. If it is not, there may be an operation inside self.evaluate that is non-differentiable, or that otherwise stops autograd from tracking the result, and that is what causes requires_grad to be False.
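One way to narrow this down is to print requires_grad and grad_fn after each step and see where tracking stops. Here is a minimal sketch; the nn.Linear model is only a hypothetical stand-in for self.evaluate, and describe is a helper name I made up for illustration:

```python
import torch

def describe(name, t):
    """Return a one-line summary of a tensor's autograd state."""
    return f"{name}: requires_grad={t.requires_grad}, grad_fn={t.grad_fn}"

x = torch.rand(4, 3, requires_grad=True)

# Hypothetical stand-in for self.evaluate: a plain linear layer.
model = torch.nn.Linear(3, 5)

logits = model(x)
out_lsm = torch.nn.LogSoftmax(dim=1)(logits)
aux_sum = torch.sum(torch.max(out_lsm, dim=1)[0])

# The first tensor in this chain that reports grad_fn=None (other than
# the leaf x itself) marks the step where the graph was cut.
for name, t in [("x", x), ("logits", logits),
                ("out_lsm", out_lsm), ("aux_sum", aux_sum)]:
    print(describe(name, t))
```

With a healthy model every tensor after x should report a grad_fn. If logits already shows requires_grad=False, the problem is inside the model's forward pass rather than in the log-softmax or the max.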

You can refer to this FAQ to learn more about None gradients: “Why are my tensor's gradients unexpectedly None or not None?”

Thank you. The issue was actually a torch.no_grad hiding in self.evaluate that I didn’t account for.
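For anyone hitting the same error, here is a small sketch of how a torch.no_grad block inside the forward pass reproduces it, even though x.requires_grad is True. The nn.Linear model and the two evaluate_* functions are hypothetical stand-ins, not the original self.evaluate:

```python
import torch

model = torch.nn.Linear(3, 5)  # hypothetical stand-in for the real model

def evaluate_no_grad(x):
    # Mimics an evaluate method that wraps its forward pass in no_grad,
    # so the result is not tracked by autograd.
    with torch.no_grad():
        return model(x)

def evaluate_with_grad(x):
    # Same forward pass, but autograd is allowed to record it.
    return model(x)

x = torch.rand(4, 3, requires_grad=True)

untracked = evaluate_no_grad(x)
print(untracked.requires_grad, untracked.grad_fn)  # False None

try:
    untracked.sum().backward()
except RuntimeError as err:
    # Same failure as in the question: the output has no grad_fn.
    print("RuntimeError:", err)

tracked = evaluate_with_grad(x)
out_lsm = torch.log_softmax(tracked, dim=1)
aux_sum = torch.sum(torch.max(out_lsm, dim=1)[0])
aux_sum.backward()
print(x.grad.shape)  # torch.Size([4, 3])
```

Setting requires_grad on the input does not help in the first case, because no_grad disables graph recording for everything computed inside the block. The forward pass has to run outside of it for the gradient to reach x.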