Getting the error message "MaskedFill can't differentiate the mask"

My program was able to run on an earlier version of PyTorch, 0.2.0+0eec332 (dating back to Oct. 6, 2017).
Yesterday and today I upgraded my PyTorch to the latest version, built from source, and now I can’t run
the program. The error message I get is as follows:

  File "/usr/local/lib/python3.5/dist-packages/torch/autograd/variable.py", line 85, in __setitem__
    return MaskedFill.apply(self, key, value, True)
  File "/usr/local/lib/python3.5/dist-packages/torch/autograd/_functions/tensor.py", line 483, in forward
    assert not ctx.needs_input_grad[1], "MaskedFill can't differentiate the mask"
AssertionError: MaskedFill can't differentiate the mask

I don’t know what happened. Can anyone help with this? Thanks in advance.

Can you try variable[key.detach()] = value instead?

Interestingly, I don’t see any relevant code changes that happened within the last month from a quick look.

I don’t know how to use variable[key.detach()]; I’ll look into it later.
If necessary, I can paste my code here.

I searched Google for the following phrase

assert not ctx.needs_input_grad[1], "MaskedFill can't differentiate the mask"

which returned this page:


I don’t know whether this code has changed.

I usually install PyTorch from source on my machine, for both Python 2.7 and Python 3.5.
My last working PyTorch for Python 3.5 was actually installed in early September;
that is the one I uninstalled and upgraded to the latest version. My current working PyTorch for
Python 2.7 was installed on Oct. 6, and I have to keep that one as it is for now.

By variable[key.detach()] = value I meant detach your index variable before trying to index into another variable using it. I suppose you didn’t explicitly call variable.__setitem__(key, value). If you post the code, I can take a look.
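
For example, a minimal sketch of the idea (hypothetical names x, grad, and mask, not code from this thread):

import torch
from torch.autograd import Variable

x = Variable(torch.randn(5), requires_grad=True)
grad = Variable(torch.randn(5))

mask = x < 0               # the index/mask is itself a Variable
grad[mask.detach()] = 0    # detach it so autograd is never asked to differentiate the mask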

My code is as follows:

class MyReLU(torch.autograd.Function):

    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        output = input.clamp(min=0)
        return output

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_variables
        # in-place masked assignment; the mask (input < 0) is built from a saved
        # Variable, and this is the line the traceback above points at
        grad_output[input < 0] = 0
        return grad_output

myRelu = MyReLU.apply

which runs well in the earlier version of PyTorch, fails in the latest version of PyTorch
(as of yesterday), and gives the error message shown in the first post above. However, after
happening to read a related post,

I modified the code to

class MyReLU(torch.autograd.Function):

    @staticmethod
    def forward(self, input):
        self.save_for_backward(input)
        output = input.clamp(min=0)
        return output

    @staticmethod
    def backward(self, grad_output):
        input, = self.saved_tensors
        grad_output[input < 0] = 0
        return grad_output

and it now works in both the old and the newer versions of PyTorch! Interesting, but I don’t know why.
If you or someone else can explain this, that would be appreciated!

This is effectively equivalent to what I suggested. In your second code snippet, input is a tensor, which does not require gradient, hence no error.
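
For example, a minimal sketch of that difference (hypothetical names, not code from this thread):

import torch
from torch.autograd import Variable

x = Variable(torch.randn(5), requires_grad=True)

variable_mask = x < 0      # mask computed from a Variable (what saved_variables hands to backward)
tensor_mask = x.data < 0   # mask computed from a plain tensor (what saved_tensors hands to backward)

grad = Variable(torch.randn(5))
grad[tensor_mask] = 0      # a plain tensor mask never asks MaskedFill to differentiate the mask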

In fact, all of your code can be simplified to either using the default ReLU or just calling .clamp directly. What’s the purpose of writing this function?


Thanks for the explanation.

The above code snippet originally comes from

http://pytorch.org/tutorials/beginner/pytorch_with_examples.html#pytorch-defining-new-autograd-functions

with the minor difference that @staticmethod is added in my code; without @staticmethod,
my program also raises an error. The other difference is that I use myRelu = MyReLU.apply and
this MyReLU class inside another model class, while the source above uses the
myRelu = MyReLU() style and runs it directly in the main program.
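
For reference, a minimal sketch of that usage (a hypothetical model, not the one from this thread), assuming the MyReLU class defined above:

import torch

myRelu = MyReLU.apply

class TwoLayerNet(torch.nn.Module):
    def __init__(self):
        super(TwoLayerNet, self).__init__()
        self.fc1 = torch.nn.Linear(10, 20)
        self.fc2 = torch.nn.Linear(20, 1)

    def forward(self, x):
        return self.fc2(myRelu(self.fc1(x)))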

Yes, this piece of code simply replicates the default / standard ReLU function.
The reason I want to exercise it is that it reveals the inner workings of the backpropagation
algorithm, and I may implement some more complicated activation functions later.
I want both the forward and backward passes, so I can’t only call .clamp.
In addition, a piece of code like "grad_output[input < 0] = 0" is elegant, and seems hard
to find in other frameworks or languages. I highly appreciate the work on PyTorch!

OK, I just tried calling .clamp directly as the ReLU function, without implementing a backward pass,
and it works without problems! This is awesome, and it means that .clamp, like the regular +, /, etc.,
is a fully supported Torch math operation whose backward pass we don’t have to implement
on our own!
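
For example, a minimal sketch of letting autograd handle the backward pass of .clamp (hypothetical tensor x, not code from this thread):

import torch
from torch.autograd import Variable

x = Variable(torch.randn(4), requires_grad=True)
y = x.clamp(min=0)    # forward: same result as ReLU
y.sum().backward()    # backward: autograd differentiates clamp for us
print(x.grad)         # 1 where x > 0, 0 where x < 0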

@Chun_Li, this is a bug in PyTorch master. I wrote up an issue here:


So it’s not just me running into this problem with the recent versions of PyTorch.
I’ll wait for the bug fix. Thanks a lot for the work of the PyTorch team!