Indexing with a tuple aliases, but with a python list copies

drevicko · August 27, 2019, 5:18am

I noticed what I find to be surprising behaviour: if I index a tensor with a python tuple, I get an alias of the indexed element, but if I index with a python list, I get a copy:

t = torch.rand(3,5)
print(t[1,2].data_ptr())
idx = (1,2)
print(t[idx].data_ptr())
idx = [1,2]
print(t[idx].data_ptr())

Output:

94484139998412
94484139998412 
94484140672144

Is this expected behaviour? I gather that indexing with a tuple is identical to directly indexing an element, so returning a reference/alias makes sense.

Am I to understand that a python list is converted to a tensor, and that tensor indexing with a non-zero-dim tensor results in a copy? If so, this discussion around indexing with zero-dim tensors seems relevant here.

There’s also this post on a possibly related confusion.

andreaskoepf · August 27, 2019, 8:26am

By using a list to index a tensor you can select elements along a dimension in arbitrary order. Because of the internal representation of a tensor over its underlying contiguous storage, it is in general not possible to create a new tensor that is just a view into the old one and shares its storage object. In contrast, with a tuple of integers you specify a single element along each dimension: each tuple entry is responsible for a different dimension.

E.g., let's say you want to index the diagonal elements in reverse:

>>> x = torch.rand(3,3)
>>> x
tensor([[0.1899, 0.9408, 0.0889],
        [0.4863, 0.5366, 0.1633],
        [0.8910, 0.4463, 0.2007]])
>>> x[[-1,-2,-3],[-1,-2,-3]]
tensor([0.2007, 0.5366, 0.1899])
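
A quick way to check the view-versus-copy difference without comparing pointers is to write into the result and see whether the original changes. This is only a minimal sketch; the variable names (row_view, rows_copy and the two flags) are made up for illustration:

```python
import torch

x = torch.arange(9.0).reshape(3, 3)

# Basic indexing (plain ints or slices): the result shares storage with x.
row_view = x[1]
row_view[0] = 100.0
view_write_visible = x[1, 0].item() == 100.0

# Indexing with a list: the selected rows are gathered into new storage.
rows_copy = x[[2, 0]]
rows_copy[0, 0] = -1.0
copy_write_visible = x[2, 0].item() == -1.0

print(view_write_visible)  # True: the write went through to x
print(copy_write_visible)  # False: x[2, 0] is still 6.0
```

Note this only concerns reading with a list index. Assigning through one, e.g. x[[2, 0]] = 0, does modify x in place.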

drevicko · August 28, 2019, 7:37am

Oh, I get it. A tuple indexes a single element, while a list provides multiple indices along the first dimension of t.

In [31]: t = torch.rand(3,5) 
    ...: print(t[1,2])
    ...: print(t[1,2].data_ptr()) 
tensor(0.1442)
140485274866972

In [32]: idx = (1,2) 
    ...: print(t[idx]) 
    ...: print(t[idx].data_ptr()) 
tensor(0.1442)
140485274866972

In [33]: idx = [1,2] 
    ...: print(t[idx])   
    ...: print(t[idx].data_ptr()) 
tensor([[0.8098, 0.4710, 0.1442, 0.5391, 0.1699],
        [0.7292, 0.6585, 0.7074, 0.4800, 0.6104]])
140485233816192
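
For anyone landing here later, the same observations can be written as checks instead of eyeballing pointers. This is a rough sketch that assumes t is freshly created and therefore contiguous; the flag names are my own:

```python
import torch

t = torch.rand(3, 5)

# A tuple is unpacked across dimensions, so t[(1, 2)] is the same as t[1, 2].
elem = t[(1, 2)]
same_memory = elem.data_ptr() == t[1, 2].data_ptr()

# For a contiguous 3x5 tensor, element (1, 2) sits 1*5 + 2 = 7 elements into the storage.
offset_matches = elem.data_ptr() == t.data_ptr() + 7 * t.element_size()

# A list of ints selects rows 1 and 2 along dim 0, like an index tensor would.
rows = t[[1, 2]]
index = torch.tensor([1, 2])
matches_tensor_index = torch.equal(rows, t[index])
matches_index_select = torch.equal(rows, t.index_select(0, index))

# The gathered rows live in their own memory, not at row 1 of t.
rows_is_alias = rows.data_ptr() == t[1].data_ptr()

print(same_memory, offset_matches)                  # True True
print(matches_tensor_index, matches_index_select)   # True True
print(tuple(rows.shape), rows_is_alias)             # (2, 5) False
```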