Hello Surya and Pytorchtester!
To clarify a bit:
Mathematically, `BCEWithLogitsLoss` is `sigmoid()` followed by `BCELoss`. But numerically they are different, with `sigmoid()` followed by `BCELoss` being the less stable of the two.
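To make the numerical difference concrete, here is a minimal sketch in plain Python floats (double precision), not PyTorch tensors. The function names `naive_bce` and `bce_with_logits` are made up for this illustration. The clamp at -100 imitates the clamp on the log terms that the PyTorch documentation describes for `BCELoss`; the rest is just the textbook formula.

```python
import math

def sigmoid(x):
    # Plain logistic function; adequate for the moderate |x| used here.
    return 1.0 / (1.0 + math.exp(-x))

def naive_bce(x, t):
    """Two-step version: sigmoid() first, then binary cross entropy."""
    p = sigmoid(x)
    # Clamp each log term at -100 (imitating the documented BCELoss clamp),
    # and avoid calling math.log(0.0), which would raise ValueError.
    log_p = max(math.log(p), -100.0) if p > 0.0 else -100.0
    log_1mp = max(math.log(1.0 - p), -100.0) if p < 1.0 else -100.0
    return -(t * log_p + (1.0 - t) * log_1mp)

def bce_with_logits(x, t):
    """One-step version computed directly from the logit x."""
    # max(x, 0) - x*t + log(1 + exp(-|x|)) never exponentiates a large
    # positive number and never takes the log of a value rounded to 0.
    return max(x, 0.0) - x * t + math.log1p(math.exp(-abs(x)))

for x, t in [(0.5, 1.0), (40.0, 0.0)]:
    print(x, t, naive_bce(x, t), bce_with_logits(x, t))
```

For the moderate logit the two versions agree. For `x = 40.0` with target `0.0`, `sigmoid(x)` rounds to exactly `1.0`, so the two-step version loses the information and falls back on the clamp, while the one-step version returns the correct loss of about 40. In single precision the same saturation sets in at smaller logits.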
Elaborating on the above, `sigmoid()` does not appear in the code because it is not an explicit step in `BCEWithLogitsLoss`. It is hiding in the `log (sigmoid())` version of the “log-sum-exp trick,” in this line from the C++ code that Pytorchtester posted:
loss = (1 - target).mul_(input).add_(max_val).add_((-max_val).exp_().add_((-input -max_val).exp_()).log_());
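To see where the `log (sigmoid())` is hiding, the line above can be transcribed term by term into scalar Python. This is only a sketch: it assumes `max_val` is `max(-input, 0)`, which is not shown in the quoted line, and the function names are made up for the illustration. The second half rebuilds the textbook loss from an explicit `log_sigmoid` helper and checks that both agree.

```python
import math

def loss_from_cpp_line(x, t):
    """Scalar transcription of the quoted C++ line."""
    # Assumption: max_val = max(-input, 0); it is defined outside the quoted line.
    max_val = max(-x, 0.0)
    # (1 - target) * input + max_val + log(exp(-max_val) + exp(-input - max_val))
    return (1.0 - t) * x + max_val + math.log(
        math.exp(-max_val) + math.exp(-x - max_val)
    )

def log_sigmoid(x):
    """log(sigmoid(x)) = -log(1 + exp(-x)), via the log-sum-exp trick."""
    # Subtracting max_val inside both exponentials keeps their arguments <= 0,
    # so neither can overflow; max_val is added back outside the log.
    max_val = max(-x, 0.0)
    return -(max_val + math.log(math.exp(-max_val) + math.exp(-x - max_val)))

def textbook_loss(x, t):
    """-[t * log(sigmoid(x)) + (1 - t) * log(1 - sigmoid(x))]."""
    # Uses the identity 1 - sigmoid(x) = sigmoid(-x).
    return -(t * log_sigmoid(x) + (1.0 - t) * log_sigmoid(-x))

for x in (-1000.0, -3.0, 0.0, 2.5, 1000.0):
    for t in (0.0, 0.3, 1.0):
        print(x, t, loss_from_cpp_line(x, t), textbook_loss(x, t))
```

Since `log(1 - sigmoid(x)) = log(sigmoid(x)) - x`, the textbook loss collapses to `(1 - t) * x - log(sigmoid(x))`, and the `max_val + log(...)` part of the C++ line is exactly that `-log(sigmoid(x))`.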
Best.
K. Frank