Can we obtain BCEWithLogitsLoss from BCELoss through weighting,
or is it just the BCE loss function combined with a sigmoid activation function?
No. They take different inputs – logits vs. probabilities – and you can’t
compensate for that difference with weighting.
Mathematically, yes. In its numerical implementation, however,
BCEWithLogitsLoss uses logsigmoid() instead of
log (sigmoid()) for
reasons of numerical stability.
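
To make the stability point concrete, here is a minimal plain-Python sketch for a single scalar logit. It is not PyTorch's actual code: the helper names (sigmoid, log_sigmoid, bce_naive, bce_with_logits) are made up for illustration, and the log-sigmoid formula used is just one common stable form, assumed here for the sake of the example.

```python
import math


def sigmoid(x: float) -> float:
    # Overflow-safe sigmoid. For very negative x the result still
    # underflows to exactly 0.0, which is what breaks the naive loss.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def log_sigmoid(x: float) -> float:
    # Stable form: log(sigmoid(x)) = min(x, 0) - log1p(exp(-|x|)).
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def bce_naive(logit: float, target: float) -> float:
    # "Sigmoid, then BCE": takes the log of an already-rounded probability.
    p = sigmoid(logit)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def bce_with_logits(logit: float, target: float) -> float:
    # Works on the logit directly, using log(1 - sigmoid(x)) = log(sigmoid(-x)).
    return -(target * log_sigmoid(logit) + (1.0 - target) * log_sigmoid(-logit))


# For a moderate logit the two versions agree up to rounding.
print(bce_naive(0.5, 1.0), bce_with_logits(0.5, 1.0))

# For an extreme logit the probability underflows to 0.0 and log(0.0) fails,
# while the logit-based version returns a finite loss.
try:
    print(bce_naive(-800.0, 1.0))
except ValueError as err:
    print("naive version failed:", err)
print(bce_with_logits(-800.0, 1.0))
```

The same reasoning is why you would pass raw logits to BCEWithLogitsLoss rather than applying a sigmoid yourself and passing the result to BCELoss.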