Masked_fill_ on non-contiguous data

I’ve encountered a weird bug in my code and wondered whether this is expected behaviour. If it is, I would say the documentation should be clearer about it, or a warning should be raised. Things like this make PyTorch code very prone to bugs, I reckon.

Somewhere in my network, I wanted to do an element-wise tensor multiplication, but before that, some values needed to be changed. My solution was to expand the tensor and perform a masked_fill_. However, the expand function makes the data non-contiguous and sets the stride of the expanded dimension to 0. Calling masked_fill_ after this did not raise any warning, but actually filled values other than those specified by the mask. I guess you could say that, in terms of what functionality masked_fill_ should provide, this is unexpected behavior, as we fill more than the mask.

import torch

my_data = torch.zeros(2)
mask = torch.ByteTensor([[0, 0], [0, 1]])

# Expand dimensions from [2] to [2 x 2]. This makes the tensor non-contiguous: the new dimension gets stride 0
my_data = my_data.expand_as(mask)

# Original data
#tensor([[ 0.,  0.],
#        [ 0.,  0.]])
print(my_data)

# Masked fill on non-contiguous data
#tensor([[ 0.,  1.],
#        [ 0.,  1.]])
print(my_data.masked_fill_(mask, 1))

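# Masked fill on contiguous data (rebuilt from fresh zeros, because the
# in-place call above already wrote into the storage that my_data shares)
#tensor([[ 0.,  0.],
#        [ 0.,  1.]])
print(torch.zeros(2).expand_as(mask).contiguous().masked_fill_(mask, 1))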
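
A minimal sketch of why the fill spreads, and of two ways around it, assuming a reasonably recent PyTorch with boolean masks. The names `base`, `expanded`, `filled` and `filled_clone` are made up for illustration. The point is that both rows of the expanded tensor alias the same two floats, so an in-place write to one row shows up in the other. The sketch deliberately avoids the in-place call on the expanded tensor itself, since newer PyTorch releases may reject that with a RuntimeError instead of silently writing.

```python
import torch

base = torch.zeros(2)
mask = torch.tensor([[False, False], [False, True]])

# expand_as creates a view, not a copy: the new leading dimension has stride 0
expanded = base.expand_as(mask)
strides = expanded.stride()
is_contig = expanded.is_contiguous()

# Both rows start at the same address, so they are the same memory
same_row_memory = expanded[0].data_ptr() == expanded[1].data_ptr()

# Option 1: out-of-place masked_fill writes into a freshly allocated result
filled = expanded.masked_fill(mask, 1.0)

# Option 2: materialise the expanded view first, then fill in place
filled_clone = expanded.clone().masked_fill_(mask, 1.0)

print(strides, is_contig, same_row_memory)
print(filled)
print(filled_clone)
print(base)  # untouched by either option
```

Either option leaves `base` alone and fills only the position selected by the mask, at the cost of allocating the full 2 x 2 tensor.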