Hi,

I've found that my BCE loss sometimes hits inf. My implementation is shown below:

```
import torch

# the model outputs a tensor of shape (N, 2)
# gt is a one-hot encoding of shape (N, 2)
prob = torch.softmax(output, dim=1)
loss = torch.sum(-torch.log(prob) * gt, dim=1)  # per-sample cross entropy
loss = torch.mean(loss)
```

In some cases, the model is overconfident and outputs two values with a very large difference, e.g. tensor([[-100., 100.]]). The softmax probability of the -100 entry underflows to exactly 0, and torch.log(0) then returns -inf, which makes the loss blow up.
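For reference, here is a small sketch that reproduces the underflow on such a logit pair, and shows that `torch.log_softmax` (which fuses the two ops using the log-sum-exp trick) stays finite and can be dropped in place of `log(softmax(...))`:

```python
import torch

output = torch.tensor([[-100.0, 100.0]])  # overconfident logits
gt = torch.tensor([[1.0, 0.0]])           # GT picks the underflowed class

# naive version: softmax underflows to exactly 0, log gives -inf
prob = torch.softmax(output, dim=1)
print(torch.log(prob))                    # tensor([[-inf, 0.]])

# stable version: log_softmax never materializes the 0 probability
log_prob = torch.log_softmax(output, dim=1)
print(log_prob)                           # tensor([[-200., 0.]]) — finite

loss = torch.mean(torch.sum(-log_prob * gt, dim=1))
print(loss)                               # tensor(200.)
```

The same effect can be had with `torch.nn.functional.cross_entropy`, which expects raw logits and class indices and applies log-softmax internally.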

Is there any way to deal with it? Thanks a lot.

Best,