In this article, there is an explanation of how backpropagation happens over a computation graph:
```python
def backward(incoming_gradients):
    self.Tensor.grad = incoming_gradients

    for inp in self.inputs:
        if inp.grad_fn is not None:
            new_incoming_gradients = \
                incoming_gradients * local_grad(self.Tensor, inp)
            inp.grad_fn.backward(new_incoming_gradients)
        else:
            pass
```
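To check my reading of the snippet, I wrote a minimal runnable toy version of it (the `Node` class, the `local_grads` field, and the leaf handling in the `else` branch are my own names and additions, not from the article or from PyTorch). Notably, I had to accumulate the gradient on leaf nodes myself instead of just `pass`, or their `grad` stays zero:

```python
# Toy mirror of the article's pseudocode (my own names, not PyTorch internals).
# Each Node stores its input Nodes and the local gradients d(self)/d(inp);
# backward() recurses through the graph like the snippet above.
class Node:
    def __init__(self, value, inputs=(), local_grads=()):
        self.value = value
        self.inputs = inputs            # upstream Nodes
        self.local_grads = local_grads  # d(self)/d(inp) for each input
        self.grad = 0.0
        # leaf nodes (no inputs) have no grad_fn, like PyTorch leaves
        self.grad_fn = self.backward if inputs else None

    def backward(self, incoming_gradient=1.0):
        self.grad += incoming_gradient
        for inp, local in zip(self.inputs, self.local_grads):
            new_incoming = incoming_gradient * local
            if inp.grad_fn is not None:
                inp.grad_fn(new_incoming)
            else:
                # leaf: accumulate the gradient here, then stop recursing
                # (the article's snippet just does `pass` at this point)
                inp.grad += new_incoming

x = Node(3.0)
w = Node(2.0)
t = Node(w.value * x.value, inputs=(w, x), local_grads=(x.value, w.value))
t.backward()
print(x.grad, w.grad)  # 2.0 3.0
```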
However, here is the question:
```python
import torch

x = torch.randn((3, 3), requires_grad=True)
w = torch.randn((3, 3), requires_grad=True)
t = w * x
loss = 10 - t.sum()
loss.backward()

print(f"{x.grad_fn=}, {x.requires_grad=}, {x.grad=}, {x.is_leaf=}")
# x.grad_fn=None, x.requires_grad=True, x.grad=tensor([[-0.1528, -0.5653,  0.0930],
#         [ 0.6547, -1.1224,  1.1258],
#         [-0.2397, -0.4652, -1.7134]]), x.is_leaf=True
```
When `inp = x`, because `x.grad_fn` is None, the code goes straight to the `else: pass` branch. Then `x.grad` should rationally still be None, since the code never computes it. But when we print it out, as shown above, `x.grad` is not None (see `x.grad=tensor([[-0.1528, -0...` above).
So does this mean the article I showed is wrong?
By the way, I'm curious: under which conditions is a tensor's `grad_fn` None? Is a tensor's `grad_fn` set to None only if its `is_leaf` is True?
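For context, here is a quick probe I ran over a few cases (variable names are mine), which is what made me suspect `grad_fn is None` lines up with `is_leaf`:

```python
import torch

a = torch.randn(3, requires_grad=True)  # leaf that requires grad
b = a * 2                               # produced by an op -> non-leaf
c = torch.randn(3)                      # leaf that does not require grad

print(a.is_leaf, a.grad_fn)  # True None
print(b.is_leaf, b.grad_fn)  # False <MulBackward0 ...>
print(c.is_leaf, c.grad_fn)  # True None
```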