In my forward function, I want to set all masked values to -inf after a sigmoid operation. Here is my code:

```
def forward(self, x, mask):
    x = self.bn(self.linear1(x))
    x = self.linear2(x).squeeze(-1)
    score = torch.sigmoid(x)
    score[mask] = -math.inf
    return score
```

When I run backpropagation, I get:

```
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
```

But if I swap the order of the in-place assignment and the sigmoid, the error does not occur:

```
def forward(self, x, mask):
    x = self.bn(self.linear1(x))
    x = self.linear2(x).squeeze(-1)
    x[mask] = -math.inf
    score = torch.sigmoid(x)
    return score
```

Here I also perform an in-place operation (`x[mask] = -math.inf`), yet backpropagation works fine (though the result is not what I want, since the masked values become 0 after the sigmoid rather than -inf). So here are my questions:
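To make the failure concrete, here is a minimal standalone repro of the first version, outside my module (my guess at the cause, which may be wrong, is that sigmoid saves its output tensor for the backward pass, and the in-place write invalidates it):

```python
import math
import torch

x = torch.randn(4, requires_grad=True)
mask = torch.tensor([True, False, True, False])

score = torch.sigmoid(x)   # sigmoid saves its output for the backward pass
score[mask] = -math.inf    # in-place write modifies that saved output

err = None
try:
    score.sum().backward()  # backward needs the (now modified) saved output
except RuntimeError as e:
    err = e
print(err)
```

On my setup this prints the same "modified by an inplace operation" RuntimeError as the full model.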

- What is the difference between these two snippets during backpropagation in PyTorch? I know in-place operations are not allowed during backpropagation, but in my understanding

  x[mask] = -math.inf

  is also an in-place operation, right?

- How can I correctly get what I want, in other words, set all masked values to -inf after the sigmoid operation and still run backpropagation properly?
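For the second question, is something like this out-of-place sketch the right approach? It assumes `masked_fill` (the non-in-place variant) returns a new tensor and leaves the sigmoid output untouched:

```python
import math
import torch

x = torch.randn(4, requires_grad=True)
mask = torch.tensor([True, False, True, False])

score = torch.sigmoid(x)
# masked_fill is out-of-place: it returns a new tensor, so the
# output that sigmoid saved for backward is not modified
masked = score.masked_fill(mask, -math.inf)
masked.sum().backward()  # no RuntimeError
print(x.grad)
```

This runs without the error and the masked positions get zero gradient, but I am not sure it is the idiomatic way.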

Thanks for your help!