Hi, I’m dealing with a weird problem that I can’t understand. In short, I don’t see why some lines of the minimal example below throw an IndexError: too many indices for tensor of dimension 2:
import torch

N = 1000
D = 42
t = torch.randn(N, D)  # the dataset (time series data)
W = 10
# these indices create windows to feed to e.g. an LSTM
indices = [range(i, i + W) for i in range(0, N - W)]
windowed_data = t[indices]        # this works
windowed_data = t[indices[:100]]  # this works too
windowed_data = t[indices[:10]]   # IndexError, wtf?
windowed_data = t[indices[:32]]   # this works
windowed_data = t[indices[:31]]   # IndexError
The above code shows that this list indexing works well when the indices list is “long enough”, which seems to mean a length of 32 or more, and throws an IndexError at a length of 31 or less.
Context: I am storing a time series dataset composed of N instances of dimensionality D as a tensor of shape (N, D), and I wrote code to “windowize” it into overlapping windows of W instances, bringing it to a shape of (N-W, W, D).
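For reference, this is the behavior I expected, written with an explicit index tensor (the arange construction below is just one way to build the same index matrix):

idx = torch.arange(N - W).unsqueeze(1) + torch.arange(W)  # shape (N-W, W) = (990, 10)
windowed_data = t[idx]
print(windowed_data.shape)  # torch.Size([990, 10, 42])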
I was doing some debugging in other parts of the code and stumbled upon this problem, which made me think that maybe I don’t understand tensor list indexing too well.
Can someone with more experience explain why this behavior appears?
The answer definitely has its roots in the C memory buffer where the tensor is stored; the translation of the Python slicing indices into C indices could also be an issue.
Ideally it should keep raising the error for any list longer than 2 elements, since the base tensor only has 2 axes to index into.
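As far as I can tell, the magic number comes from a heuristic that PyTorch inherited from NumPy in its Python indexing layer: a list shorter than 32 elements that contains sequences or slices is treated as if it were a tuple of per-dimension indices, while a longer list is always treated as a single index array. A minimal sketch of the consequence (shapes chosen only for illustration):

import torch

t = torch.randn(1000, 42)
two = [range(0, 3), range(0, 3)]   # a 2-element list containing sequences

# Shorter than 32 -> treated like t[two[0], two[1]], i.e. the pairs (0,0), (1,1), (2,2):
print(t[two].shape)                # torch.Size([3])

# The same indices passed as one tensor -> a single index array over dim 0:
print(t[torch.tensor(two)].shape)  # torch.Size([2, 3, 42])

# With more than two sequences in a short list, each one is matched to a
# tensor dimension, hence the error on a 2-D tensor:
# t[[range(0, 3)] * 5]             # IndexError: too many indices for tensor of dimension 2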
Thanks @InnovArul and @my3bikaht! Every test works perfectly if I use torch.tensor(indices) or the more explicit notation t[indices[:10], :].
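Concretely, both of these now work for any number of windows (shapes follow from the example above):

windowed_data = t[torch.tensor(indices)]  # index tensor -> shape (990, 10, 42)
windowed_data = t[indices[:10], :]        # explicit tuple -> shape (10, 10, 42)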
I’ll mentally archive this in my “weird things” section; I thought there was something specific going on that I didn’t know about. Thanks everyone for your help and participation!