Boolean Indexing

samuel · March 12, 2019, 12:48pm

I found a behavior in boolean indexing that I could not completely explain. It works fine with a tensor:

>>> a = torch.tensor([[1,2],[3,4]])
>>> a[torch.tensor([[True,False],[False,True]])]
tensor([1, 4])

It does not work with a list of booleans:

>>> a[[[True,False],[False,True]]]
tensor([3, 2])

My best guess is that in the second case the bools are cast to long and treated as indices. Is this behavior intended, or is it a bug?

albanD · March 12, 2019, 1:06pm

Hi,

If you pass a bool tensor, it is interpreted as a mask and will return the entries where True is given.
Isn’t it interpreting the list of bools as a list of ints, with False=0 and True=1? Wrapping this in a torch.ByteTensor() will recover the mask behavior.
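
To make the two readings explicit, here is a minimal sketch that builds the index tensors by hand instead of passing a raw list. The names `flags`, `mask`, `rows` and `cols` are only illustrative, and `dtype=torch.bool` assumes a PyTorch version that has a dedicated bool dtype (it is used here in place of the `torch.ByteTensor()` wrapper mentioned above).

```python
import torch

a = torch.tensor([[1, 2], [3, 4]])
flags = [[True, False], [False, True]]

# Reading 1: an explicit boolean mask with the same shape as `a`.
# Assumption: torch.bool is available in the installed PyTorch.
mask = torch.tensor(flags, dtype=torch.bool)
masked = a[mask]

# Reading 2: the same list cast to integers and used as coordinates,
# first sub-list = row indices, second sub-list = column indices.
rows = torch.tensor(flags[0], dtype=torch.long)  # tensor([1, 0])
cols = torch.tensor(flags[1], dtype=torch.long)  # tensor([0, 1])
picked = a[rows, cols]

print(masked)  # tensor([1, 4])
print(picked)  # tensor([3, 2])
```

Building the index tensor explicitly avoids depending on how a particular version treats a nested Python list of bools.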

imaluengo · March 12, 2019, 1:58pm

I think PyTorch is following the same behaviour as NumPy here, as @albanD mentioned:

1- When a boolean tensor / array is passed as the index, it acts as a mask.

2- In both PyTorch and NumPy, a Python list is assumed to hold the coordinates to grab:

>>> import numpy as np
>>> a = np.array([[1, 2], [3, 4]])
>>> a[[[1, 0], [0, 1]]]
array([3, 2])

Which reads as:
1- From row 1 pick column 0
2- From row 0 pick column 1

The only thing that differs between PyTorch and NumPy here is that PyTorch is interpreting True = 1 and False = 0 (as suggested by @albanD)
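
The two interpretations discussed in this thread can be modelled in plain Python, without torch or numpy. This is only a sketch of the semantics, not how either library implements indexing, and `as_mask` and `as_coordinates` are hypothetical helper names.

```python
a = [[1, 2], [3, 4]]
idx = [[True, False], [False, True]]

def as_mask(data, mask):
    """Keep the entries whose mask value is True, in row-major order."""
    return [v for row, mrow in zip(data, mask) for v, m in zip(row, mrow) if m]

def as_coordinates(data, index):
    """Treat index[0] as row numbers and index[1] as column numbers."""
    rows, cols = index
    # int(True) == 1 and int(False) == 0, so bools can act as positions.
    return [data[int(r)][int(c)] for r, c in zip(rows, cols)]

print(as_mask(a, idx))         # [1, 4]
print(as_coordinates(a, idx))  # [3, 2]
```

The same nested list gives `[1, 4]` when read as a mask and `[3, 2]` when read as coordinates, which matches the two outputs in the original question.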

samuel · March 12, 2019, 2:15pm

Yes, this is also how I understood it. But this is a problem, isn’t it? Isn’t it very misleading that a list of bools is not interpreted as a mask, while a tensor of uint8 is?

imaluengo · March 12, 2019, 2:30pm

Yup, I agree, I would also find it unexpected behaviour. I was not trying to justify it, just to give an explanation.

Whether it is the correct behaviour or not is up to the PyTorch devs.