Updated weights are not leaf tensors?

Answer 1:

The initial weight (created by the user, typically via torch.nn.Parameter) is considered a leaf tensor if it has requires_grad=True. This is because it is directly created by the user and not the result of an operation.

  • Updated weights (after an operation, such as applying gradients during backpropagation) are not leaf tensors. These updated weights are the result of operations (like adding the gradients to the previous weights), and therefore they have a grad_fn that points to the operation used to create them. Hence, they are non-leaf tensors.

So, only the initial weights (before training) are leaf tensors with grad_fn=None, while the updated weights are the result of a computation (e.g., weight update using gradients) and thus are not leaf nodes.

Answer 2:
Here, weights is a leaf tensor, and after the update, new_weights is a new tensor that results from an operation on weights. Despite being created through an operation, new_weights is still a leaf tensor because it’s a direct result of your manual creation (the subtraction operation), not an operation involving tensors that would produce a non-leaf tensor.
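
Answer 2 seems to assume a manual, out-of-place update along these lines (my reconstruction of the snippet it is describing, not the exact code):

```python
import torch

weights = torch.randn(3, requires_grad=True)   # created directly by the user
loss = (weights ** 2).sum()
loss.backward()

# the manual, out-of-place update Answer 2 is talking about
new_weights = weights - 0.1 * weights.grad

print(weights.is_leaf, new_weights.is_leaf)
```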

============
Is this correct?

Is the updated weight considered a leaf node in PyTorch or not?
Could anyone help me? Thanks.

I asked ChatGPT and got two contradictory explanations…

ChatGPT is wrong in both cases. Parameters are and stay leaf tensors, because all optimizer updates are performed in-place on them: no new tensor is created, so no grad_fn is ever attached to the parameter.
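
You can verify both points in a few lines (a minimal sketch; torch.optim.SGD stands in for any built-in optimizer):

```python
import torch

w = torch.nn.Parameter(torch.randn(3))
opt = torch.optim.SGD([w], lr=0.1)

loss = (w ** 2).sum()
loss.backward()
opt.step()  # updates w in-place (under torch.no_grad() internally)

print(w.is_leaf, w.grad_fn)   # True None  -> still a leaf, no grad_fn

# Contrast: an out-of-place update DOES create a new, non-leaf tensor,
# which is why Answer 2 is also wrong
manual = w - 0.1 * w.grad
print(manual.is_leaf, manual.grad_fn)  # False <SubBackward0 ...>
```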