resize_ doesn’t seem to work, while set_ seems to work. Any idea why?
import torch

B = 2
T = 4
x = torch.rand(B, T)

# ground truth: slice off the last column
y = x[:, :T - 1].clone()
# y tensor([[0.6385, 0.3710, 0.0730],
#           [0.7255, 0.4496, 0.9298]])

b = x.clone()
b.resize_(B, T - 1)
# b tensor([[0.6385, 0.3710, 0.0730],
#           [0.2039, 0.7255, 0.4496]])

c = x.clone()
reshaped = c[:, :T - 1]
c.set_(c.storage(), 0, reshaped.size(), reshaped.stride())
# c tensor([[0.6385, 0.3710, 0.0730],
#           [0.7255, 0.4496, 0.9298]])
resize_ is just a low-level construct that makes b a contiguous Tensor with the given size (increasing the storage if needed).
Note that it does not initialize the memory or guarantee reuse of the existing storage, so you can get arbitrary values after calling it.
resize_ is not doing slicing!
set_ changes the storage and metadata of the Tensor. In this case, you set a new storage, offset, size and stride, and they happen to be exactly what you expect.
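A small sketch of the stride difference (the values after resize_ are not guaranteed, since storage reuse and memory initialization are not promised):

import torch

B, T = 2, 4
x = torch.rand(B, T)

# Slicing keeps the original row stride of 4, so it skips the last column:
sliced = x[:, :T - 1]
print(sliced.size(), sliced.stride())  # torch.Size([2, 3]) (4, 1)

# resize_ gives a contiguous (2, 3) tensor with stride (3, 1); if the old
# storage happens to be reused, it reinterprets the first 6 elements row by
# row, which is why the values look "shifted" in the example above:
b = x.clone().resize_(B, T - 1)
print(b.size(), b.stride())  # torch.Size([2, 3]) (3, 1)

# set_ with the slice's size and stride reproduces the slicing result,
# because it keeps the row stride of 4:
c = x.clone()
view = c[:, :T - 1]
c.set_(c.storage(), 0, view.size(), view.stride())
print(c.size(), c.stride())  # torch.Size([2, 3]) (4, 1)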
I thought that resize_ does a similar set_ under the hood (and adjusts the strides). If it doesn't, then I guess there should be an op that does… Or at least some explicit mention in the docs about this use case.
I am not sure I understand what you're trying to do here. What would be the matching out-of-place code?
Also, what is the benefit of doing a view op in place?
# usage: shrink all tensors in place to the smallest size along dim
resize_to_min_size_(tensorA, tensorB, *many_more_tensors, dim=-1)

def resize_to_min_size_(*tensors, dim=-1):
    # builds an index like (:, :, ..., dim_slice) targeting the given dim
    tensor_dim_slice = lambda tensor, dim, dim_slice: tensor[(dim if dim >= 0 else dim + tensor.dim()) * (slice(None),) + (dim_slice,)]
    size = min(t.shape[dim] for t in tensors)
    for t in tensors:
        if t.shape[dim] > size:
            reshaped = tensor_dim_slice(t, dim, slice(size))
            # use the slice's storage offset so tensors that are already views stay correct
            t.set_(t.storage(), reshaped.storage_offset(), reshaped.size(), reshaped.stride())
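For illustration, a hypothetical run (tensor names are made up; assumes torch is imported and the function above is defined):

a = torch.rand(2, 5)
b = torch.rand(2, 3)
resize_to_min_size_(a, b, dim=-1)
print(a.shape, b.shape)  # torch.Size([2, 3]) torch.Size([2, 3])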
This is to guard against errors when adding tensors to the call line. It is a workaround for the absence of a SAME padding mode while doing some audio processing.
Basically, this is some in-place slicing.
But why doesn't this return a new list of updated Tensors instead?
It would be more efficient, as you skip the set_ call.
It would also be less dangerous: with the in-place version, anything else that holds a reference to one of these tensors will also see the new size.
Just to save keystrokes and avoid typing all the tensor names twice - this is just some code for hacking around.
Without this it would be:
a, b, c, d, e, f, g = slice_to_size(a, b, c, d, e, f, g). Very cumbersome if the names are long and similar-ish.
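For reference, a minimal sketch of such an out-of-place slice_to_size helper (the name is taken from the snippet above; this is not an existing PyTorch function):

import torch

def slice_to_size(*tensors, dim=-1):
    # return views sliced to the smallest size along dim, leaving the inputs untouched
    size = min(t.shape[dim] for t in tensors)
    return [t.narrow(dim, 0, size) for t in tensors]

a, b = torch.rand(2, 5), torch.rand(2, 3)
a, b = slice_to_size(a, b)
print(a.shape, b.shape)  # torch.Size([2, 3]) torch.Size([2, 3])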
Of course I agree that this is a dangerous operation (as are resize_ and set_). This is just to mention that the resize_ semantics were not very clear (w.r.t. strides). Maybe it should accept a flag that would also adjust the strides.
resize_() should be retired
And the doc already mentions this semantic very clearly: https://pytorch.org/docs/stable/tensors.html#torch.Tensor.resize_ (see the warning).
Another point is that the autograd semantics for in-place views are weird at best. And we currently don't have any in-place view op, so this would be a brand new concept we would need to introduce and support.
But for other people like me, maybe it's worth expanding the warning to explain that, in this case, modifying the strides would also be needed.