Implementation of Binary Cross Entropy?

Hi All,

I want to write code for label smoothing using BCEWithLogitsLoss.
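
Something along these lines is what I have in mind (just a rough sketch; eps is a placeholder smoothing factor I picked, not anything official):

import torch
import torch.nn as nn

eps = 0.1                                   # placeholder smoothing factor
criterion = nn.BCEWithLogitsLoss()
logits = torch.randn(8, 1)                  # raw model outputs (logits)
targets = torch.empty(8, 1).random_(2)      # hard 0/1 labels
smoothed = targets * (1 - eps) + 0.5 * eps  # pull labels towards 0.5
loss = criterion(logits, smoothed)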

Q1) Is BCEWithLogitsLoss = BCELoss + sigmoid()?
Q2) While checking the PyTorch GitHub source I found the following code, in which there is no sigmoid implementation. Am I looking at the wrong files?

Can someone tell me where the actual BCEWithLogitsLoss code is written?

class BCEWithLogitsLoss(_Loss):

    def __init__(self, weight: Optional[Tensor] = None, size_average=None, reduce=None, reduction: str = 'mean',
                 pos_weight: Optional[Tensor] = None) -> None:
        super(BCEWithLogitsLoss, self).__init__(size_average, reduce, reduction)
        self.register_buffer('weight', weight)
        self.register_buffer('pos_weight', pos_weight)

    def forward(self, input: Tensor, target: Tensor) -> Tensor:
        return F.binary_cross_entropy_with_logits(input, target,
                                                  self.weight,
                                                  pos_weight=self.pos_weight)



def binary_cross_entropy_with_logits(input, target, weight=None, size_average=None,
                                     reduce=None, reduction='mean', pos_weight=None):
    # type: (Tensor, Tensor, Optional[Tensor], Optional[bool], Optional[bool], str, Optional[Tensor]) -> Tensor
    r"""Function that measures Binary Cross Entropy between target and output
    logits.

    See :class:`~torch.nn.BCEWithLogitsLoss` for details.

    Args:
        input: Tensor of arbitrary shape
        target: Tensor of the same shape as input
        weight (Tensor, optional): a manual rescaling weight
            if provided it's repeated to match input tensor shape
        size_average (bool, optional): Deprecated (see :attr:`reduction`). By default,
            the losses are averaged over each loss element in the batch. Note that for
            some losses, there are multiple elements per sample. If the field :attr:`size_average`
            is set to ``False``, the losses are instead summed for each minibatch. Ignored
            when reduce is ``False``. Default: ``True``
        reduce (bool, optional): Deprecated (see :attr:`reduction`). By default, the
            losses are averaged or summed over observations for each minibatch depending
            on :attr:`size_average`. When :attr:`reduce` is ``False``, returns a loss per
            batch element instead and ignores :attr:`size_average`. Default: ``True``
        reduction (string, optional): Specifies the reduction to apply to the output:
            ``'none'`` | ``'mean'`` | ``'sum'``. ``'none'``: no reduction will be applied,
            ``'mean'``: the sum of the output will be divided by the number of
            elements in the output, ``'sum'``: the output will be summed. Note: :attr:`size_average`
            and :attr:`reduce` are in the process of being deprecated, and in the meantime,
            specifying either of those two args will override :attr:`reduction`. Default: ``'mean'``
        pos_weight (Tensor, optional): a weight of positive examples.
                Must be a vector with length equal to the number of classes.

    Examples::

         >>> input = torch.randn(3, requires_grad=True)
         >>> target = torch.empty(3).random_(2)
         >>> loss = F.binary_cross_entropy_with_logits(input, target)
         >>> loss.backward()
    """
    if not torch.jit.is_scripting():
        tens_ops = (input, target)
        if any([type(t) is not Tensor for t in tens_ops]) and has_torch_function(tens_ops):
            return handle_torch_function(
                binary_cross_entropy_with_logits, tens_ops, input, target, weight=weight,
                size_average=size_average, reduce=reduce, reduction=reduction,
                pos_weight=pos_weight)
    if size_average is not None or reduce is not None:
        reduction_enum = _Reduction.legacy_get_enum(size_average, reduce)
    else:
        reduction_enum = _Reduction.get_enum(reduction)

    if not (target.size() == input.size()):
        raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))

    return torch.binary_cross_entropy_with_logits(input, target, weight, pos_weight, reduction_enum)

Thanks !!!


Hello,

  1. Yes, they are mathematically equivalent, but applying sigmoid() yourself and then BCELoss is numerically less stable (see the quick numerical check after this list).
  2. The code of the BCEWithLogitsLoss class can be found in https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/loss.py
    There you will find this forward method, which calls the functional version:
    def forward(self, input: Tensor, target: Tensor) -> Tensor:
        return F.binary_cross_entropy_with_logits(input, target,
                                                  self.weight,
                                                  pos_weight=self.pos_weight,
                                                  reduction=self.reduction)
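
A quick numerical check of point 1 (just a sketch; exact values depend on the random inputs):

import torch
import torch.nn.functional as F

logits = torch.randn(4)
targets = torch.empty(4).random_(2)

# fused, numerically stable version
with_logits = F.binary_cross_entropy_with_logits(logits, targets)
# explicit sigmoid followed by plain BCE
two_step = F.binary_cross_entropy(torch.sigmoid(logits), targets)

print(torch.allclose(with_logits, two_step))  # True for moderate logits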

The F object in the forward method above is torch.nn.functional, defined in functional.py here: https://github.com/pytorch/pytorch/blob/master/torch/nn/functional.py

There you will find the definition of the function:

def binary_cross_entropy_with_logits(input, target, weight=None, size_average=None,
                                     reduce=None, reduction='mean', pos_weight=None):

It calls handle_torch_function, defined in https://github.com/pytorch/pytorch/blob/master/torch/overrides.py
You will find an entry for binary_cross_entropy_with_logits in the ret dictionary there, which contains every function that can be overridden in PyTorch.
This is the Python side of the __torch_function__ override mechanism.
More info in https://github.com/pytorch/pytorch/issues/24015
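
For illustration only (a minimal sketch of that override mechanism, not something BCEWithLogitsLoss itself does), a Tensor subclass can intercept the call via __torch_function__:

import torch
import torch.nn.functional as F

class LoggingTensor(torch.Tensor):
    # Calls involving this subclass are routed through handle_torch_function,
    # which invokes this hook before falling back to the normal implementation.
    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        if kwargs is None:
            kwargs = {}
        print("intercepted:", getattr(func, "__name__", func))
        return super().__torch_function__(func, types, args, kwargs)

logits = torch.randn(3).as_subclass(LoggingTensor)
targets = torch.empty(3).random_(2).as_subclass(LoggingTensor)
loss = F.binary_cross_entropy_with_logits(logits, targets)
# prints "intercepted: binary_cross_entropy_with_logits" (possibly among other calls)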

Then the code that actually runs is in this C++ file:
https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/Loss.cpp


Tensor binary_cross_entropy_with_logits(const Tensor& input, const Tensor& target, const Tensor& weight, const Tensor& pos_weight, int64_t reduction) {
    Tensor loss;
    auto max_val = (-input).clamp_min_(0);
    if (pos_weight.defined()) {
        // pos_weight need to be broadcasted, thus mul(target) is not inplace.
        auto log_weight = (pos_weight - 1).mul(target).add_(1);
        loss = (1 - target).mul_(input).add_(log_weight.mul_(((-max_val).exp_().add_((-input - max_val).exp_())).log_().add_(max_val)));
    } else {
        loss = (1 - target).mul_(input).add_(max_val).add_((-max_val).exp_().add_((-input -max_val).exp_()).log_());
    }

    if (weight.defined()) {
        loss.mul_(weight);
    }

    return apply_loss_reduction(loss, reduction);
}

It takes advantage of the log-sum-exp trick for numerical stability:
https://en.wikipedia.org/wiki/LogSumExp
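
To make the structure of that line easier to read, here is the same formula transcribed into Python (a sketch per element, without weight / pos_weight; it should match F.binary_cross_entropy_with_logits up to floating point error):

import torch
import torch.nn.functional as F

def stable_bce_with_logits(x, y):
    # Same formula as the C++ else-branch above:
    # (1 - y) * x + log(1 + exp(-x)), where the log term is written with the
    # log-sum-exp trick so that exp() never overflows.
    max_val = (-x).clamp(min=0)
    return (1 - y) * x + max_val + ((-max_val).exp() + (-x - max_val).exp()).log()

x = torch.randn(5) * 20                     # include some large-magnitude logits
y = torch.empty(5).random_(2)
print(torch.allclose(stable_bce_with_logits(x, y),
                     F.binary_cross_entropy_with_logits(x, y, reduction='none')))
# expected: True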

You can compare it with BCELoss in the binary_cross_entropy_out_cpu() function.
It is less stable because it applies L = -w (y ln(x) + (1-y) ln(1-x)) directly to the already-squashed probabilities x:


                // Binary cross entropy tensor is defined by the equation:
                // L = -w (y ln(x) + (1-y) ln(1-x))
                return (target_val - scalar_t(1))
                    * std::max(scalar_t(std::log(scalar_t(1) - input_val)), scalar_t(-100))
                    - target_val * std::max(scalar_t(std::log(input_val)), scalar_t(-100));
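
A small example of where the two diverge (assuming float32 defaults; the printed values depend on the clamp shown above):

import torch
import torch.nn.functional as F

x = torch.tensor([-200.0])   # very confident, very wrong logit
y = torch.tensor([1.0])

# sigmoid(-200) underflows to exactly 0.0 in float32, so ln(x) would be -inf;
# the clamp at -100 in the kernel above caps the loss at 100.
print(F.binary_cross_entropy(torch.sigmoid(x), y))         # -> tensor(100.)

# The with-logits version never materialises sigmoid(x) and returns the true value.
print(F.binary_cross_entropy_with_logits(x, y))            # -> tensor(200.)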

Hello Surya and Pytorchtester!

To clarify a bit:

Mathematically, BCEWithLogitsLoss is sigmoid() followed by
BCELoss. But numerically they are different, with the BCELoss
version being the less stable of the two.

Elaborating on the above, sigmoid() is not there, because it is
not explicitly part of BCEWithLogitsLoss. It is hiding in the
log(sigmoid()) version of the "log-sum-exp trick," in this line
from the C++ code that Pytorchtester posted:

loss = (1 - target).mul_(input).add_(max_val).add_((-max_val).exp_().add_((-input -max_val).exp_()).log_());
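
Spelling that out (same plain notation as above, with
max_val = max(-x, 0) as in the C++ code):

-ln(sigmoid(x))     = ln(1 + exp(-x))
-ln(1 - sigmoid(x)) = x + ln(1 + exp(-x))

so

loss = -[ y ln(sigmoid(x)) + (1 - y) ln(1 - sigmoid(x)) ]
     = (1 - y) x + ln(1 + exp(-x))
     = (1 - y) x + max_val + ln(exp(-max_val) + exp(-x - max_val))

which is exactly the line above: the sigmoid only ever appears
inside a log, so it never has to be evaluated on its own.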

Best.

K. Frank
