In my forward function, I want to set all masked values to -inf after a sigmoid operation. Here is my code:

```
def forward(self, x, mask):
    x = self.bn(self.linear1(x))
    x = self.linear2(x).squeeze(-1)
    score = torch.sigmoid(x)
    score[mask] = -math.inf
    return score
```

When I run backpropagation, I get:

```
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
```

But if I swap the order of the in-place assignment and the sigmoid, the error does not occur:

```
def forward(self, x, mask):
    x = self.bn(self.linear1(x))
    x = self.linear2(x).squeeze(-1)
    x[mask] = -math.inf
    score = torch.sigmoid(x)
    return score
```

Here I also perform an in-place operation (`x[mask] = -math.inf`), yet backpropagation works fine (though the result is not what I want, since the masked values become 0 after the sigmoid rather than -inf). So here are my questions:
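To make the failure concrete, here is a minimal standalone repro of the first version, outside my module (my guess at the cause, which may be wrong, is that sigmoid saves its output tensor for the backward pass, and the in-place write invalidates it):

```python
import math
import torch

x = torch.randn(4, requires_grad=True)
mask = torch.tensor([True, False, True, False])

score = torch.sigmoid(x)   # sigmoid saves its output for the backward pass
score[mask] = -math.inf    # in-place write modifies that saved output

err = None
try:
    score.sum().backward()  # backward needs the (now modified) saved output
except RuntimeError as e:
    err = e
print(err)
```

On my setup this prints the same "modified by an inplace operation" RuntimeError as the full model.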

- What is the difference between these two snippets during backpropagation in PyTorch? I know in-place operations are not allowed during backpropagation, but in my understanding

  x[mask] = -math.inf

  is also an in-place operation, right?

- How can I correctly get what I want, in other words, set all masked values to -inf after the sigmoid operation and still run backpropagation properly?
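For the second question, is something like this out-of-place sketch the right approach? It assumes `masked_fill` (the non-in-place variant) returns a new tensor and leaves the sigmoid output untouched:

```python
import math
import torch

x = torch.randn(4, requires_grad=True)
mask = torch.tensor([True, False, True, False])

score = torch.sigmoid(x)
# masked_fill is out-of-place: it returns a new tensor, so the
# output that sigmoid saved for backward is not modified
masked = score.masked_fill(mask, -math.inf)
masked.sum().backward()  # no RuntimeError
print(x.grad)
```

This runs without the error and the masked positions get zero gradient, but I am not sure it is the idiomatic way.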

Thanks for your help!