I am trying to print the gradient values of three tensors, but the printed gradients do not match my manual calculation; specifically, the gradient of a seems to be swapped with the gradient of c. Check the code below.
import torch
a = torch.tensor(3.0, requires_grad=True)  # leaf tensor: a.grad is populated by backward()
b = a * 2   # non-leaf (intermediate) tensor
c = b ** 2  # non-leaf (intermediate) tensor
b.retain_grad()  # keep gradients of non-leaf tensors so b.grad and c.grad are populated
c.retain_grad()
c.backward() # Computes the gradient of current tensor wrt graph leaves.
print(a.grad)
print(b.grad)
print(c.grad)
So you’ve calculated dc/da, which I thought would be the gradient value stored for c, and I agree that 24 is the correct answer. However, when I print c.grad the output is 1, and confusingly a.grad prints 24. Do you see where I am confused?
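For reference, here is the chain-rule arithmetic behind the 24 (and the 12), just writing out what we already discussed:

c = b**2 = (2*a)**2 = 4*a**2, so dc/da = 8*a = 8*3 = 24, and dc/db = 2*b = 2*6 = 12.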
They don’t seem to be correct, in the sense that they don’t match the manual calculation you and I did. Here is what I am getting:
print(a.grad) → tensor(24.)
print(b.grad) → tensor(12.)
print(c.grad) → tensor(1.)
I’m not sure it is mentioned explicitly, possibly because doing c.retain_grad(); c.backward() is relatively rare (e.g. loss.grad is normally not populated), so there is usually less room for this confusion.
The common case is more like: I have a loss and many parameters, and it’s understood that the gradient of the loss w.r.t. each parameter is stored in that parameter’s .grad field. In other words, x.grad holds the gradient of the tensor you called backward() on with respect to x, so here a.grad = dc/da = 24, b.grad = dc/db = 12, and c.grad = dc/dc = 1.
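To make that concrete, here is a minimal sketch of that common case (w, x, and loss are just illustrative names, not from the code above):

import torch

w = torch.tensor(2.0, requires_grad=True)  # a parameter (leaf tensor)
x = torch.tensor(3.0)                      # some input data
loss = (w * x - 1.0) ** 2                  # a scalar loss (non-leaf tensor)

loss.backward()  # populates .grad on leaf tensors that require grad

print(w.grad)     # tensor(30.) -> d(loss)/dw = 2*(w*x - 1)*x = 2*5*3
print(loss.grad)  # None (plus a warning), since loss is non-leaf and retain_grad() was not called

The gradients live on the parameters you differentiate with respect to; the loss itself keeps no .grad unless you explicitly retain it.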