Tensor identities not working. Bug or feature?

Florian_Dietz · May 25, 2019, 5:47pm

I want to compare two tensors by identity / by reference. Normally, comparing two objects in Python by identity works with ‘is’ or by comparing id() values. However, this does not seem to work:

import torch

a = torch.tensor([1, 2, 3])
b = a.detach()
print(a.data is b.data) # False
print(id(a.data) == id(b.data)) # True

Shouldn’t both of these statements be True?
It looks like I can make a reference comparison with id()==id(), but I am reluctant to trust that this will always work because to my knowledge using ‘is’ should be the way to do it, and that isn’t working.

Can anyone clarify if that is a bug or a feature? How are you supposed to compare two tensors by reference?

tom · May 25, 2019, 7:11pm

Sorry, but what you’re doing is completely bogus.

a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5])
print(id(a.data) == id(b.data))

gives True, too!

What happens is that a.data and b.data never exist at the same time: each one is a newly allocated object that is disposed of immediately, so both can end up at the same memory address (see the documentation of id).
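A minimal sketch of this effect, assuming a recent PyTorch and CPython. The names x, y and ids_differ are only for illustration. The last line prints whatever the interpreter happens to do; address reuse is a CPython implementation detail, so no particular result is guaranteed there.

```python
import torch

a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5])

# Every access to .data builds a fresh Python object, so even the same
# tensor never hands back the identical object twice.
print(a.data is a.data)  # False

# Binding the results to names keeps both objects alive at once,
# so they cannot share an address and their ids must differ.
x = a.data
y = b.data
ids_differ = id(x) != id(y)
print(ids_differ)  # True

# Without names, the first temporary may be freed before the second is
# created, and the freed address may be reused. The result is not guaranteed.
print(id(a.data) == id(b.data))
```

In other words, id() values are only comparable for objects whose lifetimes overlap.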

In your example, a and b are distinct Tensors using the same memory area; a.data and b.data are completely distinct tensors, too (also distinct from a and b).
I will add "don’t use .data" – see the PyTorch 0.4 migration guide.

Best regards

Thomas

Florian_Dietz · May 25, 2019, 7:43pm

I see. Thanks for the clarification.

Is there any way to do what I need, though:

Is there a way to compare a and b from the example that will evaluate to True even though b is the detached version of a?

tom · May 26, 2019, 8:25pm

It is not possible to do this exactly.

Well, you apparently don’t want (a == b).all().item().

What you could do is to see whether they point to the exact same memory range - i.e. .data_ptr(), .storage_offset(), .shape and .stride() are the same.
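A rough sketch of that check, with the caveat from this post that it may not be a good idea in practice. The helper name same_memory_view is hypothetical, not a PyTorch function; it only combines the four accessors listed above.

```python
import torch

def same_memory_view(x: torch.Tensor, y: torch.Tensor) -> bool:
    """Return True if x and y describe the same view of the same memory.

    Hypothetical helper: compares only the pointer, storage offset,
    shape and stride. It does not look at the values at all.
    """
    return (
        x.data_ptr() == y.data_ptr()
        and x.storage_offset() == y.storage_offset()
        and x.shape == y.shape
        and x.stride() == y.stride()
    )

a = torch.tensor([1, 2, 3])
b = a.detach()                 # shares memory with a
c = torch.tensor([1, 2, 3])    # same values, separate memory

print(same_memory_view(a, b))      # True
print(same_memory_view(a, c))      # False
print(same_memory_view(a, a[:2]))  # False: same start address, different shape
```

Note that this treats any two tensors viewing the same memory the same way as equal, not only a tensor and its detached version, and it can be fooled if one tensor is freed and its memory is reused by another.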

I’m not sure I recommend that. My impression is that you’re trying to do something that will end up not working well.

Best regards

Thomas

Florian_Dietz · May 26, 2019, 9:36pm

I don’t want to use (a == b).all().item() because it’s possible that two tensors contain the same values by accident, especially since a few tensors get initialized to be all-zeros.

I ended up just storing the original vector along with the detached version in any place where I need to make comparisons. It’s a bit redundant, but at least this is guaranteed to work.
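One possible shape for that workaround, as a sketch only: the names Tracked, track and same_origin are made up for illustration and are not from the actual code. The idea is to keep a reference to the original next to the detached copy and to compare the originals with ‘is’.

```python
from typing import NamedTuple

import torch

class Tracked(NamedTuple):
    """Hypothetical container pairing a tensor with its detached version."""
    original: torch.Tensor
    detached: torch.Tensor

def track(t: torch.Tensor) -> Tracked:
    # Keeping a reference to the original keeps it alive, so identity
    # comparisons on it stay meaningful.
    return Tracked(original=t, detached=t.detach())

def same_origin(x: Tracked, y: Tracked) -> bool:
    # Compare by reference, not by value.
    return x.original is y.original

a = torch.zeros(3)
z = torch.zeros(3)  # same values as a, but a different tensor

print(same_origin(track(a), track(a)))  # True
print(same_origin(track(a), track(z)))  # False, despite equal values
```

Two all-zero tensors are kept apart here because only the references are compared, never the values.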