My program ran fine on an earlier version of PyTorch, 0.2.0+0eec332 (dating back to Oct. 6, 2017).
Yesterday and today I upgraded PyTorch to the latest version from source, and now I can’t run
the program. The error message I get is as follows:
File "/usr/local/lib/python3.5/dist-packages/torch/autograd/variable.py", line 85, in __setitem__
return MaskedFill.apply(self, key, value, True)
File "/usr/local/lib/python3.5/dist-packages/torch/autograd/_functions/tensor.py", line 483, in forward
assert not ctx.needs_input_grad[1], "MaskedFill can't differentiate the mask"
AssertionError: MaskedFill can't differentiate the mask
I don’t know what is going wrong. Can anyone help with this? Thanks in advance.
I don’t know how to use variable[key.detach()] yet; I will look at it later.
If necessary, I can paste my code here.
I searched Google for the following phrase
assert not ctx.needs_input_grad[1], "MaskedFill can't differentiate the mask"
which led me to this page
I don’t know whether this code has changed.
I usually install PyTorch from source on my machine for both Python 2.7 and Python 3.5.
My last working version of PyTorch for Python 3.5 was actually installed earlier this September;
it has since been uninstalled and upgraded to the latest one. My current working version of PyTorch for
Python 2.7 was installed on Oct. 6, and I have to keep it for now.
By variable[key.detach()] = value I meant: detach your index variable before using it to index into another variable. I suppose you didn’t explicitly call variable.__setitem__(key, value). If you post the code, I can take a look.
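To make the suggestion concrete, here is a minimal sketch of a masked assignment with a detached mask. It is written against a recent PyTorch release (plain tensors instead of Variable), and the names x, y, mask and out are made up for illustration. In recent versions a comparison such as y < 0 already returns a boolean tensor that carries no gradient, so the .detach() call is harmless there and mainly documents the intent.

```python
import torch

# Hypothetical input; requires_grad=True so autograd tracks it.
x = torch.tensor([-1.0, 2.0, -3.0, 4.0], requires_grad=True)
y = x * 2

# The mask is only used to select positions; it should not be differentiated.
mask = y < 0

# Write into a copy so the tensor autograd saved for y is not modified.
out = y.clone()
out[mask.detach()] = 0  # masked assignment with a detached index

out.sum().backward()
print(out)     # masked positions are zero
print(x.grad)  # gradient is zero wherever the mask was True
```

The gradient flows only through the positions that were not overwritten, which is the behaviour a masked fill is expected to have.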
which runs well on the earlier version of PyTorch but fails on the latest version
(as of yesterday) with the error message shown in my first post above. However, I then
happened to read a related post
and it now works on both the old and the newer versions of PyTorch! Interesting, but I don’t know why!
If you or someone else can explain this, I would appreciate it!
with the minor difference that @staticmethod is added in my code. Without @staticmethod
my program also raises an error. The only other difference is that I use myRelu = MyReLU.apply and
use this MyReLU class inside another model class, whereas the code at the source link above uses
myRelu = MyReLU() and runs it directly in the main program.
Yes, this piece of code simply replicates the standard ReLU function.
The reason I want to work through it is that it reveals the inner workings of the backpropagation
algorithm, and I may implement a more complicated activation function later!
I want to write both the forward and backward passes myself, so I can’t just call .clamp.
In addition, a line like “grad_output[input < 0] = 0” is elegant, and seems hard
to find in other frameworks or languages. I highly appreciate the work on PyTorch!
OK, I just tried calling .clamp directly as the ReLU function, without implementing a backward pass,
and it works without a problem! This is awesome, and it means that .clamp, like the regular +, /, etc.,
is a differentiable Torch math operation whose backward pass we don’t have to implement
ourselves!
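A quick way to check this, assuming a recent PyTorch release (the tensor values are made up for illustration):

```python
import torch

x = torch.tensor([-1.0, 2.0, -3.0, 4.0], requires_grad=True)

# ReLU expressed with a built-in op; no custom backward pass is written.
y = x.clamp(min=0)
y.sum().backward()

print(y)       # negative entries clamped to zero
print(x.grad)  # autograd supplies the gradient of clamp
```

For these inputs the gradient matches what a hand-written ReLU backward would give: 1 for the positive entries and 0 for the negative ones.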