[Solved, Bug] Inconsistent behavior for Tensor list indexing

Hi, I’m dealing with a weird problem that I cannot understand. In short, I don’t understand why some lines of the minimal example below throw an IndexError: too many indices for tensor of dimension 2:

import torch

N = 1000
D = 42
t = torch.randn(N, D)  # the dataset (time series data)
W = 10
# these indices create windows to feed to e.g. an LSTM
indices = [range(i, i + W) for i in range(0, N - W)]

windowed_data = t[indices]        # this works
windowed_data = t[indices[:100]]  # this works too
windowed_data = t[indices[:10]]   # IndexError, wtf?

windowed_data = t[indices[:32]]   # this works
windowed_data = t[indices[:31]]   # IndexError

The code above shows that the list indexing I’m doing works when the indices list is “long enough”, i.e. 32 elements or more, and throws an IndexError at 31 or fewer.

Context: I am storing a time series dataset composed of N instances of dimensionality D as a tensor of shape (N, D), and I wrote code to “windowize” it into overlapping windows of W instances, bringing it to a shape of (N - W, W, D).
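As an aside, I know I could build the windows with unfold instead of list indexing. A minimal sketch, continuing the snippet above (note it yields N - W + 1 windows, one more than my range(0, N - W) version, and returns a view of t rather than a copy):

windows = t.unfold(0, W, 1).permute(0, 2, 1)  # window dim 0, size W, step 1
print(windows.shape)  # torch.Size([991, 10, 42])

But I’d still like to understand what’s happening with the list indexing.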

I was doing some debugging in other parts of the code and stumbled upon this problem, which made me think that maybe I don’t understand tensor list indexing too well.

Can someone with more experience explain why this behavior appears?


Looks like a good reason for raising an issue.
BTW, everything works after converting indices[:xx] with torch.tensor().
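For example (a quick sketch of the conversion; torch.tensor consumes the list of range objects directly, otherwise [list(r) for r in indices[:10]] does the same):

idx = torch.tensor(indices[:10])  # shape (10, 10), dtype torch.int64
windowed_data = t[idx]            # works: torch.Size([10, 10, 42])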

I think you’re indexing the second dimension.

I am not sure why it doesn’t work, but the code below works:

windowed_data = t[indices[:10], :]
windowed_data = t[indices[:31], :]
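A quick sanity check of the shapes (my guess is that the trailing , : turns the index into an explicit tuple, so the list in the first slot can no longer be misread as several per-dimension indices):

print(t[indices[:10], :].shape)  # torch.Size([10, 10, 42])
print(t[indices[:31], :].shape)  # torch.Size([31, 10, 42])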

This is an awesome issue :thinking: :thinking:

The answer definitely has its roots in the C memory buffer where the tensor is stored; the translation of the Python slicing indices to C indices could also be an issue.

Ideally it should keep raising the error whenever more than two indices are given, since the base 2-D tensor has no further axes to reference.
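Another guess, given the exact 31/32 boundary: the Python indexing layer seems to inherit NumPy’s old heuristic where a sequence shorter than 32 elements (NumPy’s MAXDIMS) that contains sub-sequences is treated as a tuple of per-dimension indices, while a longer one is treated as a single advanced index. Under that reading (a sketch, not verified against the PyTorch source):

short = indices[:31]
# t[short] would be read as t[short[0], short[1], ..., short[30]],
# i.e. 31 indices into a 2-D tensor -> IndexError.
# An explicit one-element tuple removes the ambiguity:
print(t[(short,)].shape)      # torch.Size([31, 10, 42])
print(t[indices[:32]].shape)  # torch.Size([32, 10, 42]), long enough to pass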

Thanks @InnovArul and @my3bikaht! Every test works perfectly if I use torch.tensor(indices) or the more explicit notation t[indices[:10], :].
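For the record, the two workarounds agree with each other:

a = t[torch.tensor(indices[:31])]
b = t[indices[:31], :]
assert torch.equal(a, b)  # both torch.Size([31, 10, 42])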

I’ll mentally archive this in my “weird things” section. I thought there was something specific going on that I didn’t know about. Thanks everyone for your help and participation!

Would you mind creating a GitHub issue for this error? It seems unexpected, given that the other syntax works fine.

I just opened a GitHub issue (Inconsistent behavior when indexing a Tensor with a list of lists · Issue #119548 · pytorch/pytorch · GitHub), since the inconsistent behavior described above is still present.