This error is raised if the model output or loss was detached from the computation graph, e.g. via one of the operations below (see the sketch after this list):
- using another library such as numpy
- using non-differentiable operations such as `torch.argmax`
- explicitly detaching the tensor via `tensor = tensor.detach()`
- rewrapping the tensor via `x = torch.tensor(x)`
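A minimal sketch showing how each of these operations yields a tensor with no `grad_fn` (tensor shapes are arbitrary):

```python
import torch

x = torch.randn(3, requires_grad=True)
y = x * 2
print(y.grad_fn)                # <MulBackward0 ...> -- still attached to the graph

# Round-tripping through numpy drops the autograd history
# (.detach() is needed since .numpy() refuses tensors that require grad)
z = torch.from_numpy(y.detach().numpy())
print(z.grad_fn)                # None

# Non-differentiable operations return tensors without a grad_fn
print(torch.argmax(y).grad_fn)  # None

# Explicit detach
print(y.detach().grad_fn)       # None

# Rewrapping creates a new leaf tensor with no history
# (also emits a UserWarning suggesting .clone().detach() instead)
print(torch.tensor(y).grad_fn)  # None
```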
The error is also raised if gradient calculation was disabled in the current context (e.g. via `torch.no_grad()`) or globally (via `torch.set_grad_enabled(False)`), so that no computation graph was created at all.
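For instance, wrapping the forward pass in `torch.no_grad()` or disabling gradients globally produces a loss without a `grad_fn`; a minimal sketch (model and input sizes are made up):

```python
import torch

model = torch.nn.Linear(4, 1)
x = torch.randn(2, 4)

with torch.no_grad():                 # gradients disabled in this context
    loss = model(x).sum()
print(loss.grad_fn)                   # None -- no graph was built
# loss.backward()                     # would raise this error

torch.set_grad_enabled(False)         # gradients disabled globally
loss = model(x).sum()
print(loss.grad_fn)                   # None
torch.set_grad_enabled(True)          # restore the default
```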
To debug this issue, check the `.grad_fn` attribute of the loss, the model output, and then the intermediate activations created in the `forward` method of your model, and make sure each returns a valid backward function object. If `None` is returned, that tensor is not attached to any computation graph.
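A sketch of this check, using a made-up model whose forward pass breaks the graph via `torch.argmax` (the model name and layer sizes are assumptions for illustration):

```python
import torch
import torch.nn as nn

class MyModel(nn.Module):              # hypothetical model for illustration
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        act = self.fc(x)
        print("activation grad_fn:", act.grad_fn)  # e.g. <AddmmBackward0 ...> -- attached
        out = torch.argmax(act, dim=1)             # non-differentiable: breaks the graph
        print("output grad_fn:", out.grad_fn)      # None -- detachment happened here
        return out

model = MyModel()
out = model(torch.randn(2, 4))         # the prints locate where grad_fn becomes None
```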