What's the difference between contiguous and non-contiguous tensors, and their functionality?
It’s a flag indicating whether the memory is stored contiguously or not.
Let's use an example to see how we can get a non-contiguous tensor.
import torch

# Create a tensor of shape [4, 3]
x = torch.arange(12).view(4, 3)
print(x, x.stride())
> tensor([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
> (3, 1)
As you can see, the tensor has the desired shape.
The strides are also interesting in this case. They basically tell us how many “steps” we skip in memory to move to the next position along a certain axis.
If we look at the strides, we see that we would have to skip 3 values to go to the next row, but only 1 to go to the next column. That makes sense so far. The values are stored sequentially in memory, i.e. the memory cells should hold the data as [0, 1, 2, 3, ..., 11].
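To make the stride arithmetic concrete, here is a minimal pure-Python sketch that needs no PyTorch. The helper `flat_offset` is a hypothetical name used only for illustration; it models how an index and the strides combine into a position in the flat memory, and is not PyTorch's actual implementation.

```python
# Flat memory behind the [4, 3] tensor: the values 0..11 in order.
storage = list(range(12))
strides = (3, 1)  # same values as x.stride() in the example


def flat_offset(index, strides):
    """Hypothetical helper: position in memory of the element at `index`."""
    return sum(i * s for i, s in zip(index, strides))


# Element at row 2, column 1 sits at 2 * 3 + 1 * 1 = 7.
print(storage[flat_offset((2, 1), strides)])

# Moving one row down skips 3 values; moving one column right skips 1.
row_step = flat_offset((1, 0), strides) - flat_offset((0, 0), strides)
col_step = flat_offset((0, 1), strides) - flat_offset((0, 0), strides)
print(row_step, col_step)
```

The two printed steps are simply the stride values again, which is all a stride is: the distance in memory between neighbours along one axis.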
Now let's transpose the tensor and look at the strides again:
y = x.t()
print(y, y.stride())
print(y.is_contiguous())
> tensor([[ 0, 3, 6, 9],
[ 1, 4, 7, 10],
[ 2, 5, 8, 11]])
> (1, 3)
> False
The print statement of the tensor yields the desired transposed view of x.
However, the strides are now swapped. To go to the next row, we only have to skip 1 value, while we skip 3 to move to the next column.
This makes sense if we recall the memory layout of the tensor:
[0, 1, 2, 3, 4, ..., 11]
In order to move to the next column (e.g. from 0 to 3), we would have to skip 3 values.
The tensor is thus non-contiguous anymore!
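The same effect can be sketched without PyTorch: the transposed view reads the unchanged memory through swapped strides. `read_view` is a hypothetical helper written for this illustration only, not a library function.

```python
storage = list(range(12))  # unchanged memory: [0, 1, ..., 11]


def read_view(storage, shape, strides):
    """Hypothetical helper: list the rows of a 2D view over flat memory."""
    rows, cols = shape
    return [
        [storage[i * strides[0] + j * strides[1]] for j in range(cols)]
        for i in range(rows)
    ]


x_rows = read_view(storage, shape=(4, 3), strides=(3, 1))  # original layout
y_rows = read_view(storage, shape=(3, 4), strides=(1, 3))  # transposed view
print(y_rows[0])  # [0, 3, 6, 9]

# Walking the transposed view row by row visits memory in this order:
visit_order = [i * 1 + j * 3 for i in range(3) for j in range(4)]
print(visit_order)
```

The visit order jumps back and forth through memory instead of running 0, 1, 2, ..., which is what the False from is_contiguous() reports.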
That’s not really a problem for us, except that some operations won’t work.
E.g. if we try to get a flattened view of y, we will run into a RuntimeError:
try:
    y = y.view(-1)
except RuntimeError as e:
    print(e)
> invalid argument 2: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Call .contiguous() before .view().
So let’s call .contiguous() before the view call:
y = y.contiguous()
print(y.stride())
> (4, 1)
y = y.view(-1)
Now the memory layout is contiguous once again (have a look at the strides) and the view works just fine.
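As a rough model of what happens here, the pure-Python sketch below rebuilds the memory in the order the transposed view reads it. `make_contiguous` is a hypothetical stand-in for illustration and only handles the 2D case; it is not how PyTorch implements .contiguous().

```python
def make_contiguous(storage, shape, strides):
    """Hypothetical model of a contiguous copy for a 2D view."""
    rows, cols = shape
    # Read the elements in row-by-row order and write them to new memory.
    new_storage = [
        storage[i * strides[0] + j * strides[1]]
        for i in range(rows)
        for j in range(cols)
    ]
    new_strides = (cols, 1)  # row-major strides for this shape
    return new_storage, new_strides


old_storage = list(range(12))
new_storage, new_strides = make_contiguous(old_storage, shape=(3, 4), strides=(1, 3))
print(new_storage)
print(new_strides)  # (4, 1)
```

After the copy, a flattened view is just the new memory read front to back, so no stride juggling is needed.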
I’m not completely sure, but I assume the contiguous call copies the memory to make it continuous again.
That being said, contiguous arrays are necessary for some vectorized instructions to work. They should also generally have some performance advantages, as their memory access pattern apparently suits modern CPUs, but I’m really not an expert on this topic, so take this last piece of information with a grain of salt.
@ptrblck. Thank you for your explanation. This is quite helpful!
@ptrblck thank you for your clear and concise explanation
@ptrblck Thanks for such a clear and concise explanation.
Thanks @ptrblck Very clear.
Apologies for being picky with the language. But
The tensor is thus non-contiguous anymore!
is confusing to me.
Did you mean:
The tensor is thus non-contiguous now!
?
It seems like the crux of your whole answer, so there is a lot of value, in my opinion, in clarifying & correcting the grammar if possible.
Thanks for your help & time.
From what I understand this is a more summarized answer:
Non-contiguous is the term used to indicate that the memory layout of a tensor does not align with its advertised meta-data or shape information.
In my opinion the word contiguous is a confusing/misleading term since in normal contexts it means that memory is not spread around in disconnected blocks (i.e. it's “contiguous/connected/continuous”).
Some operations might need this contiguous property for some reason (most likely efficiency).
Resource:
What would be the difference if I change “anymore” to “now”?
is_contiguous() returned True before and now returns False, thus I think “anymore” is the right wording.
That might be the source of the confusion. I’m talking strictly about how tensor.is_contiguous() is defined.
If that’s the case I think the right wording is to say that it’s non-contiguous now. Not sure what anymore would mean. Perhaps you meant “not contiguous anymore”. I think that makes sense grammatically as far as I understand English.
You mean how it’s defined in pytorch and not in standard computer hardware? Perhaps I have the terms confused, hence my motivation to seek clarification.
I see the confusion now between “non-contiguous anymore” and “not contiguous anymore”.
Thanks for clarifying
Can’t tell if you are being sarcastic or funny (or both!) but I think it’s clear now thanks to the discussion. Thanks ptrblck! I appreciate the help, your posts across the forum are really helpful
Thanks for your clear explanation
That’s a damn good explanation. Thanks
So basically contiguous means the stride == width of a row? So that the operations can glide over the dimensions neatly?
maybe this answer helps you: neural network - PyTorch - contiguous() - Stack Overflow for a different explanation. Hope it helps!
A tensor whose values are laid out in the storage starting from the rightmost dimension onward (that is, moving along rows for a 2D tensor) is defined as contiguous. Contiguous tensors are convenient because we can visit them efficiently in order without jumping around in the storage (improving data locality improves performance because of the way memory access works on modern CPUs). This advantage of course depends on the way algorithms visit. Some tensor operations in PyTorch only work on contiguous tensors, such as view, […]. In that case, PyTorch will throw an informative exception and require us to call contiguous explicitly. It’s worth noting that calling contiguous will do nothing (and will not hurt performance) if the tensor is already contiguous.
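One way to read this definition is as a check on the strides: a tensor counts as contiguous when its strides equal the ones a row-by-row layout of the same shape would produce. The sketch below is a simplified pure-Python model under that assumption; `row_major_strides` and `looks_contiguous` are hypothetical names, and the real is_contiguous() may handle more edge cases (for example empty tensors).

```python
def row_major_strides(shape):
    """Strides a tensor of `shape` would have if stored row by row."""
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))


def looks_contiguous(shape, strides):
    """Simplified check: strides must match the row-major strides.

    Dimensions of size 1 are skipped because their stride is never used.
    """
    expected = row_major_strides(shape)
    return all(
        size == 1 or actual == wanted
        for size, actual, wanted in zip(shape, strides, expected)
    )


print(row_major_strides((4, 3)))         # (3, 1)
print(looks_contiguous((4, 3), (3, 1)))  # True: the original tensor
print(looks_contiguous((3, 4), (1, 3)))  # False: the transposed view
```

So for a 2D tensor the row stride has to equal the row width and the column stride has to be 1.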
Note this is a more specific meaning than the general use of the word “contiguous” in computer science (i.e. contiguous and ordered).
E.g. given a tensor:
[[1, 2],
 [3, 4]]
Storage in memory | PyTorch contiguous? | Generally “contiguous” in memory-space?
---|---|---
1 2 3 4 0 0 0 | yes | yes
1 3 2 4 0 0 0 | no | yes
1 0 2 0 3 0 4 | no | no
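The three layouts in the table can be checked with a small pure-Python sketch. The strides paired with each layout are my own assumption: they are the values that make each layout read back as [[1, 2], [3, 4]], and `read_2x2` is a hypothetical helper for illustration.

```python
target = [[1, 2], [3, 4]]

# (memory contents, assumed strides that recover the tensor from them)
layouts = [
    ([1, 2, 3, 4, 0, 0, 0], (2, 1)),
    ([1, 3, 2, 4, 0, 0, 0], (1, 2)),
    ([1, 0, 2, 0, 3, 0, 4], (4, 2)),
]


def read_2x2(storage, strides):
    """Hypothetical helper: read a 2x2 view from flat memory."""
    return [
        [storage[i * strides[0] + j * strides[1]] for j in range(2)]
        for i in range(2)
    ]


results = []
for storage, strides in layouts:
    same_tensor = read_2x2(storage, strides) == target
    row_major = strides == (2, 1)  # strides of a row-by-row 2x2 layout
    results.append((same_tensor, row_major))
    print(storage, strides, same_tensor, row_major)
```

All three layouts describe the same tensor, but only the first has row-by-row strides, which is the stricter sense of “contiguous” discussed in this thread.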
in short:
When is a case where we do need to call contiguous?
I’m not completely sure, but I assume the contiguous call copies the memory to make it continuous again.
Yes, it does.
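import torch

t1 = torch.arange(9).view(3, 3)
# Build a transposed (non-contiguous) view over the same storage
t2 = torch.as_strided(t1, size=(3, 3), stride=(1, 3))
# contiguous() copies the data into a new storage
t3 = t2.contiguous()
assert t2.storage().data_ptr() != t3.storage().data_ptr()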