I note I can use torch.roll to do circular shift for a tensor, but this function seems to generate a copy of the original tensor, which can be expensive if the original tensor is large. Is there a way to do circular shift without generating a copy of the original tensor (the underlying mechanism may be something like moving a head pointer in a circular array)?

I do not believe that pytorch offers a view (or an in-place) version of
roll() (although hypothetically it could).

A head-pointer scheme won’t work with the way pytorch stores tensors.
Pytorch stores tensors (ignoring strides) in row-major form, and to preserve
this storage format, roll() has to reorder the tensor’s elements in memory.
We can use .storage() to probe how a tensor is actually stored.
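Here is a small probe along those lines. It shows that after a roll() the
elements really have been reordered in the (row-major) storage, and that the
result lives in freshly allocated memory rather than sharing the original
tensor’s buffer:

```python
import torch

t = torch.arange(6).reshape(2, 3)
print(t)
# tensor([[0, 1, 2],
#         [3, 4, 5]])
print(list(t.storage()))      # row-major layout: [0, 1, 2, 3, 4, 5]

r = torch.roll(t, 1, dims=1)  # circular shift of each row by one
print(r)
# tensor([[2, 0, 1],
#         [5, 3, 4]])
print(list(r.storage()))      # elements physically reordered: [2, 0, 1, 5, 3, 4]

# roll() allocates new storage -- the data pointers differ
print(t.data_ptr() == r.data_ptr())  # False
```

Because the rolled elements occupy different relative positions in storage,
no choice of offset and strides over the original buffer could describe the
result, which is why a no-copy view version of roll() isn’t possible.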

One could imagine implementing a “head pointer” as part of a more
complicated data structure. Note that such a head pointer couldn’t simply
index into a flat circular array. (Consider what would happen if you rolled,
say, a 5d tensor along its third dimension: the shift wraps around
independently within each slice along that dimension, not around the flat
storage.) But implementing such a scheme would require teaching many of
pytorch’s tensor operations, e.g., matmul(), cumsum(), tensor indexing, and
so on, to understand this head-pointer data structure, at some cost in
efficiency.