Wrong result of torch.nn.BCEWithLogitsLoss()

import torch

target = torch.tensor([[1.0, 0.0]])
pred = torch.tensor([[0.85, 0.15]])  # predictions (logits)

criterion = torch.nn.BCEWithLogitsLoss()
print(criterion(pred, target).item())

Shows: 0.5634111166000366

Should be:

import math
import torch
import torch.nn as nn

tSigm = nn.Sigmoid()
t2 = torch.tensor(0.85)

# -y*log(sigmoid(x)) - (1-y)*log(1-sigmoid(x)) with y = 1.0:
print(-1.0 * math.log(tSigm(t2).item()) - 0.0 * math.log(1 - tSigm(t2).item()))

0.3558650064855657
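
In other words, what I expect is the binary cross-entropy term for a single logit:

$$-1.0 \cdot \log \sigma(0.85) \;-\; 0.0 \cdot \log\bigl(1 - \sigma(0.85)\bigr) \approx 0.3559$$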

The loss calculation is not wrong according to the docs: BCEWithLogitsLoss treats every element of pred as an independent logit (even though your values happen to look like probabilities) and, with the default mean reduction, averages the per-element losses. This example computes the loss for both entries:

import torch

target = torch.tensor([[1.0, 0.0]])
pred = torch.tensor([[0.85, 0.15]])

# Per-element losses (no reduction)
criterion = torch.nn.BCEWithLogitsLoss(reduction="none")
print(criterion(pred, target))
# tensor([[0.3559, 0.7710]])

# First element: target is 1.0, so only the positive term contributes
print((-1.0) * torch.log(torch.sigmoid(pred[0, 0])))
# tensor(0.3559)
# Second element: target is 0.0, so only the negative term contributes
print((-1.0) * torch.log(1 - torch.sigmoid(pred[0, 1])))
# tensor(0.7710)

# Default mean reduction averages over both elements
criterion = torch.nn.BCEWithLogitsLoss()
print(criterion(pred, target))
# tensor(0.5634)

print(((-1.0) * torch.log(torch.sigmoid(pred[0, 0])) + (-1.0) * torch.log(1 - torch.sigmoid(pred[0, 1]))) / 2.)
# tensor(0.5634)

Thank you!

I am used to cross-entropy being calculated as a sum of -p*log(q) over all classes and did not take into account that for binary cross-entropy a single probability is enough.
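
As a quick check of that point (a minimal sketch; the single-column shapes are just illustrative): passing only the positive-class logit and target to BCEWithLogitsLoss reproduces the value I originally expected.

import torch

# One logit and one target probability per sample.
pred = torch.tensor([[0.85]])    # logit for the positive class only
target = torch.tensor([[1.0]])

criterion = torch.nn.BCEWithLogitsLoss()
print(criterion(pred, target).item())
# ≈ 0.3559, matching the manual calculation above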