How to use BCE loss and CrossEntropyLoss correctly?

The docs will give you some information about these loss functions as well as small code snippets.

For a binary classification, you could either use nn.BCE(WithLogits)Loss with a single output unit or nn.CrossEntropyLoss with two output units.
nn.CrossEntropyLoss is usually used for multi-class classification, but you can treat the binary case as a 2-class classification problem; which approach you use is up to you.

If you are using the former approach, we generally recommend using nn.BCEWithLogitsLoss and passing the raw logits to this criterion, as it yields better numerical stability than sigmoid + nn.BCELoss.
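
Here is a minimal sketch of the single-output-unit approach; the feature dimension, batch size, and nn.Linear model are just placeholders:

```python
import torch
import torch.nn as nn

# Single output unit -> one raw logit per sample (placeholder model).
model = nn.Linear(10, 1)
criterion = nn.BCEWithLogitsLoss()

x = torch.randn(4, 10)                         # batch of 4 samples
target = torch.randint(0, 2, (4, 1)).float()   # binary targets as floats, same shape as output

logits = model(x)                              # raw logits, no sigmoid applied
loss = criterion(logits, target)
loss.backward()
```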

The latter approach also expects raw logits, which can be passed directly to nn.CrossEntropyLoss, as shown below.
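
A minimal sketch of the same problem treated as a 2-class classification; again the layer sizes are placeholders:

```python
import torch
import torch.nn as nn

# Two output units -> one raw logit per class (placeholder model).
model = nn.Linear(10, 2)
criterion = nn.CrossEntropyLoss()

x = torch.randn(4, 10)
target = torch.randint(0, 2, (4,))   # class indices (0 or 1) as a LongTensor

logits = model(x)                    # raw logits, no softmax applied
loss = criterion(logits, target)
loss.backward()
```

Note that the target formats differ: nn.BCEWithLogitsLoss expects floating-point targets with the same shape as the output, while nn.CrossEntropyLoss expects class indices.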
