- Many operations can produce `nan`. In general, if the result of an operation is mathematically indeterminate, it will probably be `nan`. For example:
```
inf - inf -> nan
(-inf) - (-inf) -> nan
inf / inf -> nan
0./0. -> nan
# inf plus inf is still inf
inf + inf -> inf
# inf plus any non-infinite number is still inf
inf + 1 -> inf
# inf divided by zero is still inf
inf / 0 -> inf
```
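These rules can be checked directly in Python, whose `float` follows IEEE-754 arithmetic. Note that plain Python raises `ZeroDivisionError` for `0./0.` and `inf / 0`, so those two cases behave as shown only in array libraries such as NumPy or PyTorch:

```python
import math

inf = float("inf")

print(inf - inf)        # nan: indeterminate
print((-inf) - (-inf))  # nan
print(inf / inf)        # nan
print(inf + inf)        # inf
print(inf + 1)          # inf

# math.isnan is the standard way to test for nan,
# since nan != nan under IEEE-754 comparison rules.
print(math.isnan(inf - inf))  # True
```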

- Enabling `detect_anomaly()` is fine for debugging, but it degrades performance. `torch.isnan()` can be used to check whether a tensor contains `nan` values.

Avoiding `nan` is a rather complex topic. `nan` can be caused by problematic data, numerical error (usually in mixed-precision training), or code bugs. To avoid `nan`, you should first find out why it occurs.