Mental model for tensor.stride()

I’m trying to understand tensor.stride().

Let’s compare three tensors with the same underlying data: [0, 1, 2].

A tensor T of shape (3,1) (i.e. a ‘column vector’ x a singleton dimension) made from that data has stride=(1,3). This makes sense because moving from component T[i,j] to T[i+1,j] corresponds to ‘going 1 forward’ in the underlying data [0,1,2]. Similarly, moving from T[i,j] to T[i,j+1] means ‘going 3 forward’, wrapping around to the beginning after reaching the end of the underlying data.

For a tensor of shape (1,3) (‘row vector’ x singleton dimension), we get stride=(3,1). This also makes sense because now moving from T[i,j] to T[i+1,j] means ‘going 3 forward’ in the underlying data, and analogously ‘going 1 forward’ for the second dimension.

However, I don’t understand the case for shape (3,1,1) (‘column vector’ x singleton dimension x singleton dimension). By the above reasoning, the stride should look like this: stride=(1,3,3) because moving from T[i,j,k] to T[i,j+1,k] would require ‘going 3 forward’ in the data [0,1,2]. However, the actual stride is (1,1,1). How does this make sense?
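For completeness, this is how I create that tensor and query its stride (building it with reshape directly from the flat data):

```python
import torch

# underlying data [0, 1, 2], viewed as shape (3, 1, 1)
t = torch.arange(3).reshape(3, 1, 1)
print(t.stride())  # (1, 1, 1)
```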

I might have made a mistake when converting the numpy arrays to tensors before querying the stride: now I get stride=(1,1) for the shape=(1,3) case, which makes all the above examples consistent with each other if one sets the stride to 1 for every singleton dimension. The only thing I don’t really understand then is why the stride for a singleton dimension should be 1 instead of 0. Moving 1*8 bytes forward in storage brings me to the next component in one of the other dimensions, instead of remaining where I am (since the singleton dimension has only size 1).
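One experiment that suggests the value stored for a singleton dimension is arbitrary: building views with as_strided (just a sketch for illustration, since as_strided bypasses the normal stride bookkeeping):

```python
import torch

base = torch.arange(3)  # storage [0, 1, 2]

# Same shape (3, 1), but different strides for the size-1 dimension.
# The only valid index along that dimension is 0, so its stride is
# always multiplied by 0 in the offset computation and never matters.
for s in (0, 1, 2):
    view = base.as_strided(size=(3, 1), stride=(1, s))
    assert torch.equal(view, torch.tensor([[0], [1], [2]]))
print("all views identical")
```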

I’ve tried to write a general explanation of tensor shapes and strides in this post, so you might want to take a look at it.

A stride of 0 would indicate that you are not “moving” and would be reading the same value from memory. This is the case if you expand a tensor, i.e. increase its size without copying data:

import torch

x = torch.tensor([1])
y = x.expand(10)
print(y.size(), y.stride())
> torch.Size([10]) (0,)
y = y.contiguous() # trigger a copy
print(y.size(), y.stride())
> torch.Size([10]) (1,)
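To make the “without copying data” part explicit, you can compare the data pointers of the original tensor, the expanded view, and the contiguous copy (a small check, nothing PyTorch-internal beyond Tensor.data_ptr()):

```python
import torch

x = torch.tensor([1])
y = x.expand(10)     # stride (0,): all 10 entries read the same stored value
z = y.contiguous()   # stride (1,): a real copy holding 10 values

assert y.data_ptr() == x.data_ptr()  # the expanded view shares x's storage
assert z.data_ptr() != x.data_ptr()  # contiguous() allocated new memory
print(y.stride(), z.stride())  # (0,) (1,)
```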