No, I don’t think your approach works as intended.
The `backward` call will not raise an error, since `nablaU` is differentiable, but because it is a newly created tensor inside the `forward` method it will never be updated.
Also, since you are detaching `nabla_x`, the parameters used before that point shouldn't get any gradients either.
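As a rough illustration, here is a minimal sketch of that failure mode. `NablaModel`, the layer, and the shapes are made-up stand-ins for your code; only the creation of `nablaU` inside `forward` and the `detach()` of `nabla_x` mimic what you describe:

```python
import torch
import torch.nn as nn

class NablaModel(nn.Module):          # hypothetical stand-in for the model in the question
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 4)

    def forward(self, x):
        nabla_x = self.fc(x).detach()                 # detach cuts the graph to self.fc
        nablaU = torch.randn(4, requires_grad=True)   # new tensor on every call, not a registered parameter
        return nabla_x * nablaU

model = NablaModel()
out = model(torch.randn(2, 4))
out.mean().backward()                                 # no error, since nablaU is differentiable

for name, param in model.named_parameters():
    print(name, param.grad)                           # fc.weight / fc.bias: None
```

`nablaU` does get a gradient here, but it is recreated in every forward pass and is not registered as a parameter, so an optimizer would never update it.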
Run a single iteration as:

```python
output = model(input)
loss = criterion(output, target)
loss.backward()
```
and check the `.grad` attributes of the model's parameters via:

```python
for name, param in model.named_parameters():
    print("param {}, grad {}".format(name, param.grad))
```
which should show `None` (unless I'm missing how the output is still attached to the computation graph).
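For completeness, a self-contained version of that check could look like the sketch below; the `nn.Linear` model, `MSELoss` criterion, and random data are placeholders for your actual setup. With an intact graph the gradients are populated, whereas the model described above would print `None`:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)          # placeholder model
criterion = nn.MSELoss()          # placeholder criterion
input = torch.randn(8, 10)        # placeholder data
target = torch.randn(8, 1)

output = model(input)
loss = criterion(output, target)
loss.backward()

for name, param in model.named_parameters():
    print("param {}, grad {}".format(name, param.grad))  # gradient tensors, since nothing is detached
```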
Do not call `optimizer.zero_grad()` before the first `backward` call, as it will fill any already existing `.grad` attributes with zeros (in the default setup), and the `None` check above would then be inconclusive.
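To see why zeroing first would hide the issue, assume `.grad` was already populated by an earlier, working iteration; the sketch below passes `set_to_none=False` explicitly to get the zero-filling behavior referred to above:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

model(torch.randn(4, 10)).sum().backward()   # earlier iteration populates .grad

optimizer.zero_grad(set_to_none=False)       # .grad is now a tensor of zeros, not None

for name, param in model.named_parameters():
    print(name, param.grad)                  # zeros, so the "is .grad None?" check can no longer detect a broken graph
```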