Tensor.to(device) changes is_leaf, causing "can't optimize a non-leaf Tensor"

This snippet demonstrates the problem:

import torch

test = torch.zeros((10, 10)).requires_grad_(True)
print(test.is_leaf)  # True
test = test.to(data.device)  # data.device comes from elsewhere in my code
print(test.is_leaf)  # False

When optimizing test using Adam later in my code, I get

ValueError: can't optimize a non-leaf Tensor

Why is this the case? Is this a bug or intended?

Oh, I should have read the docs more carefully. They state that to() returns a copy when the device (or dtype) changes. Since the copy is recorded in the autograd graph, the original CPU tensor is the parent of the one moved to the GPU, which makes the moved tensor a non-leaf.
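To see this without needing a GPU, note that to() with a different dtype behaves the same way: the result carries a grad_fn pointing back to the original leaf, and gradients flow through the copy. A minimal sketch:

```python
import torch

leaf = torch.zeros((10, 10), requires_grad=True)
print(leaf.is_leaf)   # True, created directly by the user

# to() with a new dtype (or device) creates a copy that is recorded
# in the autograd graph, so the result is no longer a leaf.
moved = leaf.to(torch.float64)
print(moved.is_leaf)  # False
print(moved.grad_fn)  # backward node linking back to `leaf`

# Gradients still flow through the copy back to the original leaf:
moved.sum().backward()
print(leaf.grad is not None)  # True
```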

This works:

test = torch.zeros((10,10)).to(data.device).detach().requires_grad_(True)
print(test.is_leaf) # True
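A slightly cleaner option, assuming you just need a fresh trainable tensor on the target device, is to create it there directly with the device and requires_grad constructor arguments, which skips the intermediate CPU tensor entirely:

```python
import torch

# Hypothetical target device for illustration; in the original code
# this would be data.device.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Created directly on the device, so it is a leaf from the start.
test = torch.zeros((10, 10), device=device, requires_grad=True)
print(test.is_leaf)  # True

optimizer = torch.optim.Adam([test], lr=1e-3)  # no ValueError
```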