The difference between torch.tensor.data and torch.tensor

I wrote the following code:

import torch

a = torch.rand(3, 5)
type(a)        # torch.Tensor
type(a.data)   # torch.Tensor
id(a)          # 4493764504
id(a.data)     # 4493475488

I don’t understand the difference between Tensor a and Tensor a.data, and when to use a and when to use a.data.
Thanks.


Hi,

The .data field is an old attribute that is kept for backward compatibility, but it should not be used anymore: its use is dangerous and can make computations silently wrong. Use .detach() and/or with torch.no_grad() instead.
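For example, a minimal sketch (using a fresh tensor here rather than the exact one above), showing that both alternatives give you a value that autograd does not track:

import torch

a = torch.rand(3, 5, requires_grad=True)

# .detach() returns a tensor that shares storage with `a` but is cut
# out of the autograd graph; in-place changes to it are still seen by
# autograd's version counter, unlike changes made through .data:
b = a.detach()
print(b.requires_grad)   # False

# torch.no_grad() disables gradient tracking for everything computed
# inside the block:
with torch.no_grad():
    c = a * 2
print(c.requires_grad)   # False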


I noticed it’s still in this tutorial

_, predicted = torch.max(outputs.data, 1)
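For what it's worth, the .data there isn't needed; a sketch of the same line without it, assuming it sits inside the tutorial's evaluation loop (net and images are the tutorial's names):

with torch.no_grad():
    outputs = net(images)
    _, predicted = torch.max(outputs, 1)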


Hi @albanD, if .data should not be used, can you please explain how to do the following while still ending up with the adjusted weights in the model?
When I run:

i = 1
for w in LinModel.parameters():
    torch.nn.init.eye_(w)     # reset the weight to the identity
    w.data = w.data * i       # scale it by i, bypassing autograd via .data
    i += 1

the model weights are adjusted properly by a factor of i after the loop ends. If I don't use .data, only the identity init is saved.

Hi,

You can do either of the following:

with torch.no_grad():
  w *= i

Or, if you cannot do it in-place directly:

with torch.no_grad():
  w.copy_(w * i)
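Putting either form back into the original loop would look roughly like this (a sketch, assuming LinModel is the model from your snippet):

i = 1
for w in LinModel.parameters():
    with torch.no_grad():
        torch.nn.init.eye_(w)   # identity init, as before
        w *= i                  # scale in place without autograd recording it
    i += 1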

If we do an in-place operation on the .data property of a leaf variable with requires_grad=True, it seems fine. But what exactly is dangerous? Is it safe to use it in this PixelCNN code (pixelcnn-pytorch/masked_cnn_layer.py at master · axeloh/pixelcnn-pytorch · GitHub)? Does it still track the gradients?

Normally, when you make an in-place update on a tensor, its version counter is bumped, and autograd will raise an error during backward if it realizes that the original value of that tensor is needed for the gradient computation.

But if a tensor is saved for backward and you then mutate its .data property in place before calling backward, you lose the protection of that version-counter check and the results will be silently incorrect.
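Here is a minimal sketch of that failure mode (hypothetical tensors, just to illustrate):

import torch

x = torch.ones(3, requires_grad=True)
h = x * 3            # intermediate value
y = h ** 2           # pow saves `h` for the backward pass

# Mutating `h` itself in place would bump its version counter and make
# y.sum().backward() raise a RuntimeError about an in-place operation.
# Mutating h.data bypasses that check: no error, but the gradient is
# computed from the new values and is silently wrong.
h.data.add_(1)
y.sum().backward()
print(x.grad)        # tensor([24., 24., 24.]) instead of the correct 18s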
