I think I found a mistake in the Automatic differentiation package - torch.autograd — PyTorch 1.12 documentation.
The explanations of `torch.Tensor.is_leaf` and `torch.Tensor.detach_` imply that a `Tensor` with `requires_grad = False` is a leaf, but exactly the opposite is the case, since the gradient is calculated w.r.t. all leaves. Am I wrong?

Kind Regards

Hi Milan,
A simple explanation -

A leaf tensor is any tensor on which no operations (that autograd can track) have been performed.
When a tensor is first initialized, it becomes a leaf node/tensor.
Basically, all inputs and weights of a neural network are leaf tensors and hence leaf nodes in the computational graph.
Whenever any operation (that autograd can track - this is important) is performed on a leaf tensor, the resulting tensor is not a leaf anymore.

See:

```python
import torch

x = torch.tensor([1.0, 2.0])
print(x.is_leaf)  # True

y = (x * 2).sum()
print(y.is_leaf)  # True
```

Although `y` is the result of operations on `x`, `y` is still a leaf because autograd isn't tracking anything (`x` has `requires_grad=False`).

Not necessarily. It has less to do with the `requires_grad` attribute and more to do with what autograd tracks.
As you saw above, exactly the opposite is not always the case. A tensor with `requires_grad = False` can be a leaf.

`detach` simply detaches a tensor from its current computation graph and returns a new tensor which is a leaf and has `requires_grad = False`.

`detach_` does the same thing, but in-place, without creating a new tensor.
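To make the difference concrete, here is a minimal sketch (the tensor names are just for illustration):

```python
import torch

a = torch.tensor([1.0, 2.0], requires_grad=True)
b = a * 2                          # tracked op, so b is not a leaf
print(b.is_leaf)                   # False

c = b.detach()                     # new tensor, cut off from the graph
print(c.is_leaf, c.requires_grad)  # True False

b.detach_()                        # same effect, applied in-place to b
print(b.is_leaf, b.requires_grad)  # True False
```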

Thank you for the clarification!
I have a follow-up question: why is it that

```python
import torch

a = torch.tensor([1., 2.], requires_grad=True)
b = 2 * a
c = a.sum()
d = (2 * a).sum()
print(a.is_leaf)  # True
print(b.is_leaf)  # False
print(c.is_leaf)  # False
print(d.is_leaf)  # False

# but
a = torch.tensor([1., 2.])  # requires_grad defaults to False
b = 2 * a
c = a.sum()
d = (2 * a).sum()
print(a.is_leaf)  # True
print(b.is_leaf)  # True
print(c.is_leaf)  # True
print(d.is_leaf)  # True
```

In the first snippet, since `a` has `requires_grad=True`, the autograd engine is now tracking every operation involving `a`.

Hence, the calculations producing `b`, `c`, and `d` are being tracked, which is why they aren't leaves any more.

In the second snippet, `a` has `requires_grad=False` and autograd isn't tracking anything, so `b`, `c`, and `d` are all leaves - this is conceptually the same situation as the code sample in my previous reply.
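If it helps, a quick way to check whether autograd tracked the operation that produced a tensor is its `grad_fn` attribute: a non-leaf tensor stores the backward function that created it, while a leaf's `grad_fn` is `None`. A minimal sketch:

```python
import torch

a = torch.tensor([1.0, 2.0], requires_grad=True)
b = 2 * a
print(a.grad_fn)  # None (a is a leaf)
print(b.grad_fn)  # <MulBackward0 object ...> (b came from a tracked op)

a2 = torch.tensor([1.0, 2.0])  # requires_grad=False by default
b2 = 2 * a2
print(b2.grad_fn)  # None (nothing was tracked, so b2 is a leaf)
```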

Let me know if you have further questions,
Srishti

Thanks, you helped me a lot!