I’ve run into some behavior that I think is unexpected.
Running the following code, the last two printed outputs differ, even though I would expect the two operations to be equivalent:
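The original snippet is not included in this post; a minimal reconstruction of the kind of comparison being described (my sketch, not the original code) would be:

```python
import torch

# Hypothetical reconstruction (not the original snippet): apply torch.tanh
# to a non-contiguous slice, with and without .contiguous().
x = torch.randn(4, 5, 6)
view = x[:, 2, ...]                    # a non-contiguous view of x
print(torch.tanh(view))
print(torch.tanh(view.contiguous()))
# On the PyTorch build this thread reports against, the two printouts
# differed; on a fixed build they are identical.
```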
What .contiguous() does is make the tensor contiguous in memory. What is happening here is that torch.tanh takes the input x[:, 2, ...] and starts with its first element. In the underlying storage, the elements that follow are not the ones you would expect from x[:, 2, ...] but rather those of x[...], so it reads those instead. Calling .contiguous() resolves this by copying the data into the order you expect.
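A quick way to see this layout difference (an illustrative example, not code from the thread): a sliced view shares storage with the full tensor and only changes its strides, while .contiguous() copies into a compact layout.

```python
import torch

# A sliced view shares storage with the original tensor; only strides change.
x = torch.arange(24).reshape(2, 3, 4)  # contiguous, strides (12, 4, 1)
v = x[:, 2, ...]                       # shape (2, 4)
print(v.is_contiguous())               # False: rows are 12 elements apart
print(v.stride())                      # (12, 1)
c = v.contiguous()                     # copies the data into a compact layout
print(c.is_contiguous())               # True
print(c.stride())                      # (4, 1)
```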
Ok, this seems to explain the behavior.
However, why doesn’t torch.tanh use strides to access the right values? This still seems like quite unexpected behaviour to me, and it implies I need to call .contiguous() every time I use torch.tanh. I need to apply a few different non-linearities to different subsets of the channels in a CNN: calling .contiguous() on each subset of the channels ends up making a complete copy of the input tensor at every forward pass, which seems like a waste of resources.
Is there a way to avoid copying the whole input tensor? Do you plan to support tensor strides?
I am sorry, but I am not a PyTorch dev, so I can’t really help with the support question. I usually call .contiguous() whenever the data points are sliced or permuted and hence not consecutive in memory. torch.tanh won’t cause a problem as long as the data points are consecutive.
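You can check whether a given view is consecutive in memory with is_contiguous() (an illustrative example):

```python
import torch

x = torch.randn(4, 5, 6)
print(x.is_contiguous())                    # True: freshly allocated tensor
print(x[:, 2].is_contiguous())              # False: sliced view
print(x.permute(0, 2, 1).is_contiguous())   # False: permuted view
```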
One possible workaround may be to permute the axes so that the values that get the same non-linearity are consecutive in memory. In your case it would be something like
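The code for this suggestion is missing from the post; one way to sketch it (assumed shapes and channel-group boundaries, names illustrative) is to move the channel axis to dim 0 and make the tensor contiguous once, so that slices along dim 0 are contiguous views and need no further copies:

```python
import torch

# Sketch of the suggested workaround (assumed shapes; a single up-front copy).
x = torch.randn(8, 6, 16, 16)             # N, C, H, W
xc = x.permute(1, 0, 2, 3).contiguous()   # C, N, H, W -- one copy here
a = torch.tanh(xc[:3])                    # contiguous view, no extra copy
b = torch.relu(xc[3:])                    # contiguous view, no extra copy
out = torch.cat([a, b], dim=0).permute(1, 0, 2, 3)  # back to N, C, H, W
```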
I saw the fixed code. It seems related to non-contiguous storage.
I am a bit confused about why it is related to UnaryOps.
What’s the main difference between UnaryOps and NonUnaryOps?