Suppose I have a tensor containing an unknown number of NaNs and infinite values. I do not know in advance how many there will be, so I need to mask them as part of a model.

I have tried:

```
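import torch
B, C, X, Y = 2, 3, 4, 5  # placeholder sizes
x = torch.randn(B, C, X, Y)
x_fix = torch.zeros_like(x)
# torch.where(cond, a, b) takes a where cond is True, so the replacement comes first
x = torch.where(torch.isnan(x) | torch.isinf(x), x_fix, x)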
```

This, however, causes NaNs in my loss function. I assume this is because the masking operation is non-differentiable.

How would I go about solving this problem?
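
To test the differentiability assumption, here is a minimal sketch that compares three cases: non-finite values already present in the input, non-finite values produced by an earlier differentiable op, and the same op with its input masked beforehand. The helper name `mask_nonfinite`, the toy tensors, and the use of `torch.sqrt` as the op that produces NaNs are all assumptions made for illustration and are not taken from my actual model.

```python
import torch


def mask_nonfinite(t: torch.Tensor) -> torch.Tensor:
    """Replace NaN and +/-inf entries with zero (hypothetical helper)."""
    return torch.where(torch.isfinite(t), t, torch.zeros_like(t))


# Case 1: NaN/inf are already in the leaf tensor.
# The masked positions receive a zero gradient, the rest receive 1.
x = torch.tensor([1.0, float("nan"), float("inf"), -2.0], requires_grad=True)
mask_nonfinite(x).sum().backward()
grad_input = x.grad.clone()

# Case 2: NaN is produced by an earlier op (sqrt of a negative number)
# and masked afterwards. The forward value is finite, but the backward
# pass of sqrt divides the zero gradient by a NaN result, giving NaN.
w = torch.tensor([4.0, -1.0], requires_grad=True)
mask_nonfinite(torch.sqrt(w)).sum().backward()
grad_late = w.grad.clone()

# Case 3: mask the *input* of the risky op as well, so sqrt never sees
# an invalid value in either the forward or the backward pass.
v = torch.tensor([4.0, -1.0], requires_grad=True)
valid = v > 0
safe_v = torch.where(valid, v, torch.ones_like(v))  # dummy value 1 where invalid
out = torch.where(valid, torch.sqrt(safe_v), torch.zeros_like(v))
out.sum().backward()
grad_early = v.grad.clone()

print(grad_input, grad_late, grad_early)
```

If the second case matches what happens in the model, the mask itself is not the problem: the NaN gradient comes from the op that created the non-finite value, and masking would have to happen before that op rather than after it.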