BCELoss vs BCEWithLogitsLoss

That’s not the case, as my post mentioned logits:

To get the predictions from logits, you could apply a threshold (e.g. out > 0.0) for a binary or multi-label classification use case

If you apply sigmoid to the output, you should threshold the resulting probabilities instead, e.g. at 0.5, which corresponds to a logit threshold of 0.0.
@KFrank explains this in more detail in this post and shares approaches for converting between logit and probability thresholds.
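
As a rough illustration of why the two thresholds agree, here is a minimal sketch in plain Python. The `sigmoid` and `logit` helpers and the sample values are my own assumptions for the example and are not taken from the linked post; in PyTorch, `torch.sigmoid` and `torch.logit` play the same roles on tensors.

```python
import math


def sigmoid(x: float) -> float:
    # Maps a logit to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-x))


def logit(p: float) -> float:
    # Inverse of sigmoid: maps a probability in (0, 1) to a logit.
    return math.log(p / (1.0 - p))


# Hypothetical raw model outputs (logits), e.g. what you would
# pass to nn.BCEWithLogitsLoss during training.
logits = [-2.0, -0.3, 0.0, 0.4, 3.0]

# Thresholding logits at 0.0 ...
preds_from_logits = [x > 0.0 for x in logits]
# ... gives the same predictions as thresholding probabilities at 0.5,
# because sigmoid(0.0) == 0.5 and sigmoid is monotonically increasing.
preds_from_probs = [sigmoid(x) > 0.5 for x in logits]

# Any other probability threshold can be moved into logit space.
prob_threshold = 0.8
logit_threshold = logit(prob_threshold)  # log(0.8 / 0.2) = log(4)
preds_logit_thr = [x > logit_threshold for x in logits]
preds_prob_thr = [sigmoid(x) > prob_threshold for x in logits]
```

Since sigmoid is monotonic, you can skip it at inference time and compare the logits against the transformed threshold directly.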
