I found that my cross-entropy loss sometimes blows up to inf. My implementation is shown below:
```python
import torch

# imagine the model outputs a tensor of shape (N, 2)
# gt is a one-hot encoding of shape (N, 2)
prob = torch.softmax(output, dim=1)
loss = torch.sum(-torch.log(prob) * gt, dim=1)
loss = torch.mean(loss)
```
In some cases the model is overconfident and outputs two logits with a very large gap, e.g. tensor([[-100., 100.]]). softmax then underflows the probability of the first class to exactly 0, and torch.log(0) returns -inf, so the loss becomes inf.
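For reference, here is a minimal repro of the failure with made-up values (the logits and one-hot target are hypothetical, just chosen to trigger the underflow):

```python
import torch

# overconfident logits; the true class is the one the model rejects
output = torch.tensor([[-100.0, 100.0]])
gt = torch.tensor([[1.0, 0.0]])

prob = torch.softmax(output, dim=1)  # exp(-200) underflows to 0 in float32
loss = torch.sum(-torch.log(prob) * gt, dim=1)  # -log(0) * 1 = inf
print(prob)  # tensor([[0., 1.]])
print(loss)  # tensor([inf])
```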
Is there any way to deal with this? Thanks a lot.
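One fix I'm considering (not sure if it's the recommended approach) is computing log-probabilities directly with torch.nn.functional.log_softmax, which uses the log-sum-exp trick internally and never materializes a zero probability:

```python
import torch
import torch.nn.functional as F

output = torch.tensor([[-100.0, 100.0]])
gt = torch.tensor([[1.0, 0.0]])

# log_softmax is numerically stable: no intermediate softmax, no log(0)
log_prob = F.log_softmax(output, dim=1)  # tensor([[-200., 0.]])
loss = torch.mean(torch.sum(-log_prob * gt, dim=1))
print(loss)  # tensor(200.) -- large but finite
```

Would this be equivalent to my original loss, just stable?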