I’ve encountered a weird bug in my code and wondered whether this is expected behaviour. If it is, I would say the documentation should be clearer about it, or that a warning should be raised. Things like this make PyTorch code very prone to bugs, I reckon.
Somewhere in my network I wanted to do an element-wise tensor multiplication, but some values needed to be changed first. My solution was to expand the tensor and perform a masked_fill_. However, expand makes the data non-contiguous and sets the stride of the expanded dimension to 0, so the expanded rows alias the same underlying storage. Calling masked_fill_ after this raises no warning, but actually masks other values than those specified by the mask. In terms of the functionality masked_fill_ should provide, I would call this unexpected behaviour, as it fills more than the mask.
import torch

my_data = torch.zeros(2)
mask = torch.ByteTensor([[0, 0], [0, 1]])

# Expand from shape [2] to [2 x 2]. This makes the data non-contiguous
# and sets the stride of the new dimension to 0, so both rows alias
# the same storage.
my_data = my_data.expand_as(mask)

# Original data
print(my_data)
# tensor([[ 0.,  0.],
#         [ 0.,  0.]])

# Masked fill on the non-contiguous data: both rows are written,
# because they share the same two storage elements
print(my_data.masked_fill_(mask, 1))
# tensor([[ 0.,  1.],
#         [ 0.,  1.]])

# Masked fill on a contiguous copy. Note the data is re-created here,
# since the in-place fill above already mutated the shared storage.
my_data = torch.zeros(2).expand_as(mask)
print(my_data.contiguous().masked_fill_(mask, 1))
# tensor([[ 0.,  0.],
#         [ 0.,  1.]])
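For reference, a small sketch of how the aliasing can be detected and avoided. This uses a bool mask rather than ByteTensor (newer PyTorch versions prefer bool masks); the stride and contiguity checks show why the fill spreads, and clone() gives each element its own storage before the in-place fill:

```python
import torch

base = torch.zeros(2)
mask = torch.tensor([[0, 0], [0, 1]], dtype=torch.bool)

expanded = base.expand_as(mask)
# The expanded view has stride 0 along the new dimension,
# so both rows point at the same two storage elements.
print(expanded.stride())         # (0, 1)
print(expanded.is_contiguous())  # False

# Cloning first materialises a real copy, so masked_fill_
# only touches the positions selected by the mask.
safe = expanded.clone().masked_fill_(mask, 1)
print(safe)
# tensor([[0., 0.],
#         [0., 1.]])
```

Checking tensor.stride() for zeros (or tensor.is_contiguous()) before an in-place op is a cheap way to catch this class of bug.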