# Contiguous vs non-contiguous tensor

What's the difference between a contiguous and a non-contiguous tensor, and how does it affect functionality?


It's a flag indicating whether the memory is contiguously stored or not.
Let's use an example to see how we can get a non-contiguous tensor.

```python
# Create a tensor of shape [4, 3]
import torch

x = torch.arange(12).view(4, 3)
print(x, x.stride())
> tensor([[ 0,  1,  2],
>         [ 3,  4,  5],
>         [ 6,  7,  8],
>         [ 9, 10, 11]])
> (3, 1)
```

As you can see, the tensor has the desired shape.
The strides are also interesting in this case. They basically tell us how many "steps" we skip in memory to move to the next position along a certain axis.
If we look at the strides, we see that we would have to skip 3 values to go to the next row, while only 1 to go to the next column. That makes sense so far: the values are stored sequentially in memory, i.e. the memory cells should hold the data as `[0, 1, 2, 3, ..., 11]`.
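To make the stride arithmetic concrete, here is a small sketch (the helper `flat_offset` is made up for illustration): it computes where element `(i, j)` sits in the underlying storage using only the strides and the storage offset.

```python
import torch

x = torch.arange(12).view(4, 3)

def flat_offset(tensor, i, j):
    # Hypothetical helper: map a 2D index to a flat storage position.
    s0, s1 = tensor.stride()
    return tensor.storage_offset() + i * s0 + j * s1

# Row 2, column 1 lives 2*3 + 1*1 = 7 cells into the storage.
assert flat_offset(x, 2, 1) == 7
assert x[2, 1].item() == 7  # arange stores the value 7 at cell 7
```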

Now let's transpose the tensor and have another look at the strides:

```python
y = x.t()
print(y, y.stride())
print(y.is_contiguous())
> tensor([[ 0,  3,  6,  9],
>         [ 1,  4,  7, 10],
>         [ 2,  5,  8, 11]])
> (1, 3)
> False
```

The print statement of the tensor yields the desired transposed view of `x`.
However, the strides are now swapped. In order to go to the next row, we only have to skip 1 value, while we have to skip 3 to move to the next column.
This makes sense if we recall the memory layout of the tensor:
`[0, 1, 2, 3, 4, ..., 11]`
In order to move to the next column (e.g. from `0` to `3`), we would have to skip 3 values.
The tensor is thus not contiguous anymore!
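We can check this directly: the transpose is a view that shares the storage of `x` and only swaps the strides (a small sketch).

```python
import torch

x = torch.arange(12).view(4, 3)
y = x.t()

# Same underlying memory, different strides.
assert x.data_ptr() == y.data_ptr()
assert y.stride() == (1, 3)
# Moving one column in y skips 3 memory cells: y[0, 1] is storage cell 3.
assert y[0, 1].item() == 3
```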

That's not really a problem for us, except that some operations won't work.
E.g. if we try to get a flattened view of `y`, we will run into a `RuntimeError`:

```python
try:
    y = y.view(-1)
except RuntimeError as e:
    print(e)
> invalid argument 2: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Call .contiguous() before .view().
```

So let's call `.contiguous()` before the `view` call:

```python
y = y.contiguous()
print(y.stride())
> (4, 1)
y = y.view(-1)
```

Now the memory layout is contiguous once again (have a look at the strides) and the `view` works just fine.
I'm not completely sure, but I assume the `contiguous` call copies the memory to make it continuous again.
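As a side note (worth checking against the docs for your PyTorch version): `.reshape()` bundles this `.contiguous()`-then-`.view()` pattern and copies only when necessary, so it also works on non-contiguous tensors.

```python
import torch

y = torch.arange(12).view(4, 3).t()
assert not y.is_contiguous()

z = y.reshape(-1)         # works even though y is non-contiguous
assert z.is_contiguous()  # reshape copied the data into contiguous memory
```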

That being said, contiguous arrays are necessary for some vectorized instructions to work. They should generally also have some performance advantages, as the memory access pattern on modern CPUs can then be used in an optimal way. But I'm really not an expert on this topic, so take this last bit of information with a grain of salt.


@ptrblck thank you for your clear and concise explanation.

Thanks @ptrblck, very clear.

Apologies for being picky with the language. But

`The tensor is thus non-contiguous anymore!` is confusing to me.

Did you mean:

`The tensor is thus non-contiguous now!`?

It seems like this is the crux of your whole answer, so there is a lot of value, in my opinion, in clarifying & correcting the grammar if possible.

Thanks for your help & time.

From what I understand, this is a more summarized answer:

Contiguous is the term used to indicate that the memory layout of a tensor aligns with its advertised meta-data or shape information; non-contiguous means it does not.

In my opinion the word contiguous is a confusing/misleading term, since in normal contexts it means memory that is not spread around in disconnected blocks (i.e. it's "contiguous/connected/continuous").

Some operations might need this contiguous property for some reason (most likely efficiency).


What would be the difference, if I changed `anymore` to `now`?
`is_contiguous()` returned `True` before and now returns `False`, so I think `anymore` is the right wording.

That might be the source of the confusion. I'm talking strictly about how `tensor.is_contiguous()` is defined.


If that's the case, I think the right wording is to say that it's non-contiguous now. Not sure what "anymore" would mean. Perhaps you meant "not contiguous anymore"; I think that makes sense grammatically, as far as I understand English.

You mean how it's defined in PyTorch and not in standard computer hardware? Perhaps I have the terms confused, hence my motivation to seek clarification.

I see the confusion now between "non-contiguous anymore" and "not contiguous anymore".
Thanks for clarifying


Can't tell if you are being sarcastic or funny (or both!), but I think it's clear now thanks to the discussion. Thanks @ptrblck! I appreciate the help; your posts across the forum are really helpful.


That's a damn good explanation. Thanks


So basically contiguous means the stride == width of a row? So that the operations can glide over the dimensions neatly?

Maybe this answer gives you a different explanation: neural network - PyTorch - contiguous() - Stack Overflow. Hope it helps!

A tensor whose values are laid out in the storage starting from the rightmost dimension onward (that is, moving along rows for a 2D tensor) is defined as `contiguous`. Contiguous tensors are convenient because we can visit them efficiently in order without jumping around in the storage (improving data locality improves performance because of the way memory access works on modern CPUs). This advantage of course depends on the way algorithms visit.

Some tensor operations in PyTorch only work on contiguous tensors, such as `view`, […]. In that case, PyTorch will throw an informative exception and require us to call `contiguous` explicitly. It's worth noting that calling `contiguous` will do nothing (and will not hurt performance) if the tensor is already contiguous.
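A quick way to convince yourself of that last point (a minimal sketch): for an already-contiguous tensor, `.contiguous()` returns the tensor itself, so the data pointer does not change.

```python
import torch

a = torch.arange(6).view(2, 3)  # row-major, already contiguous
b = a.contiguous()

assert a.is_contiguous()
assert a.data_ptr() == b.data_ptr()  # no copy was made
```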

Note this is a more specific meaning than the general use of the word "contiguous" in computer science (i.e. contiguous and ordered).

e.g. given a tensor:

```
[[1, 2],
 [3, 4]]
```

| Storage in memory | PyTorch `contiguous`? | Generally "contiguous" in memory-space? |
| --- | --- | --- |
| `1 2 3 4 0 0 0` | yes | yes |
| `1 3 2 4 0 0 0` | no | yes |
| `1 0 2 0 3 0 4` | no | no |
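The `1 3 2 4 0 0 0` layout above can be reproduced with `torch.as_strided` (a sketch): the values sit in one block of memory, but not in rightmost-dimension order, so PyTorch reports the tensor as non-contiguous.

```python
import torch

storage = torch.tensor([1, 3, 2, 4])
t = torch.as_strided(storage, size=(2, 2), stride=(1, 2))

# Reads back as [[1, 2], [3, 4]] even though memory holds 1 3 2 4.
assert t.tolist() == [[1, 2], [3, 4]]
assert not t.is_contiguous()
```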

in short:

when is a case where we actually need to call `contiguous`?

> I'm not completely sure, but I assume the `contiguous` call copies the memory to make it continuous again.

Yes, it does.

```python
import torch

t1 = torch.arange(9).view(3, 3)
t2 = torch.as_strided(t1, size=(3, 3), stride=(1, 3))  # non-contiguous view
t3 = t2.contiguous()                                   # triggers a copy
assert t2.storage().data_ptr() != t3.storage().data_ptr()
```