I’ve found some strange behavior which I think is unexpected.
Running the following code, the last two printed outputs differ, though I would expect the two operations to be equivalent:
import torch

x = torch.randn(3, 3, 1, 2)
print(x[:, 2, ...])
print(torch.tanh(x[:, 2, ...]))
print(torch.tanh(x[:, 2, ...].contiguous()))
It seems that
torch.tanh(x[:, 2, ...]) gives the wrong result.
Is this supposed to happen? If so, why?
Thanks in advance
PyTorch version: 0.4.1
Python version: 3.6
What .contiguous() does is make the tensor’s elements consecutive in memory. What seems to be happening here is that torch.tanh takes the input x[:, 2, ...] and starts with its first element, then walks memory sequentially. But in the underlying storage, the next elements are not the ones you would expect from x[:, 2, ...]; they belong to x[...] as a whole. So it reads those values instead. Calling .contiguous() resolves this by copying the elements into the order you expect.
Hope this helps.
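To make the stride mechanics concrete, here is a hypothetical pure-Python sketch (not actual PyTorch internals) of how a strided view maps logical indices to storage offsets, and how a kernel that ignores the strides and just walks raw storage ends up reading the wrong elements:

```python
# Hypothetical sketch of strided indexing (not actual PyTorch internals).
# A tensor of shape (3, 3, 1, 2) stored contiguously has strides (6, 2, 2, 1):
# element (i, j, k, l) lives at flat offset i*6 + j*2 + k*2 + l*1.

shape = (3, 3, 1, 2)
strides = (6, 2, 2, 1)
storage = list(range(18))  # 3*3*1*2 = 18 elements; each value equals its offset

def offset(index, strides):
    """Flat storage offset of a multi-dimensional index."""
    return sum(i * s for i, s in zip(index, strides))

# The view x[:, 2, ...] fixes j=2: base offset 2*2 = 4, view strides (6, 2, 1).
view_elems = [storage[4 + offset((i, k, l), (6, 2, 1))]
              for i in range(3) for k in range(1) for l in range(2)]
print(view_elems)   # [4, 5, 10, 11, 16, 17] -- stride-aware order

# A kernel that ignores strides and just walks the first 6 storage slots
# starting at the base offset reads different elements:
naive_elems = storage[4:4 + 6]
print(naive_elems)  # [4, 5, 6, 7, 8, 9] -- wrong elements
```

After .contiguous(), the six view elements are copied into six consecutive storage slots, so both access patterns agree.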
Hi! Thank you for your answer!
OK, that seems to explain the behavior.
However, why doesn’t torch.tanh use the strides to access the right values? This still seems like quite unexpected behaviour to me, and it implies that I need to call .contiguous() every time I use torch.tanh. I need to apply a few different non-linearities to different subsets of the channels in a CNN: calling .contiguous() on each subset of the channels would end up making a complete copy of the input tensor at every forward pass, which seems like a waste of resources.
Is there a way to avoid copying the whole input tensor? Do you plan to support tensor strides?
Thanks in advance
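The channel-subset pattern described above can be sketched as follows. This is a NumPy analog (the sizes and the choice of non-linearities are made up for illustration; the same slicing applies to PyTorch tensors):

```python
import numpy as np

# Hypothetical sketch: apply different non-linearities to different
# channel subsets of a (batch, channels, H, W) activation map.
x = np.random.randn(4, 6, 8, 8)  # made-up sizes

def relu(a):
    return np.maximum(a, 0.0)

# tanh on channels 0-2, relu on channels 3-5, then stitch back together.
out = np.concatenate([np.tanh(x[:, :3]), relu(x[:, 3:])], axis=1)
print(out.shape)  # (4, 6, 8, 8)
```

Note that each channel slice x[:, :3] and x[:, 3:] is a non-contiguous view, which is exactly the situation that triggered the bug above.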
I am sorry, but I am not a PyTorch dev, so I can’t really help with the support question. I usually call .contiguous() whenever the data points have been sliced or permuted and are therefore no longer consecutive in memory; torch.tanh won’t cause a problem as long as the data points are consecutive.
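As a cheap guard, you can check contiguity before deciding whether a copy is needed (PyTorch exposes .is_contiguous() for this, and .contiguous() is a no-op on an already-contiguous tensor). Here is an analogous NumPy sketch, since NumPy uses the same strided-view model and the same slice produces a non-contiguous view there too:

```python
import numpy as np

x = np.arange(18).reshape(3, 3, 1, 2)

view = x[:, 2, ...]                # strided view, no copy
print(view.flags['C_CONTIGUOUS'])  # False: elements are not consecutive

dense = np.ascontiguousarray(view) # analogous to .contiguous(): copies
print(dense.flags['C_CONTIGUOUS']) # True
print(np.array_equal(view, dense)) # True: same logical values
```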
One possible workaround may be to permute the axes so that the values that get the same non-linearity are consecutive. In your case it would be something like:
x1 = x.permute(0, 2, 3, 1).contiguous()
print(torch.tanh(x1[:, :, :, -1]))
print(torch.tanh(x1[:, :, :, -1].contiguous()))
The last two lines should then return the same result.
This doesn’t seem right.
The workaround of @TheShadow29 might work, but the behavior still seems weird.
Thanks for reporting it!
@wowwokele @TheShadow29 This issue should be fixed in the current master. PR
I saw the fixed code. It seems related to non-contiguous storage.
I am a bit confused about why it is related to UnaryOps.
What’s the main difference between UnaryOps and non-unary ops?