This is my personal opinion and others might have different preferences, so take it with a grain of salt.
For a multi-class classification, I would use nn.CrossEntropyLoss, which also provides the ignore_index argument. This makes sense, as if I'm dealing with e.g. 1000 classes, I might just want to ignore a certain one.
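E.g. something like this minimal sketch (the shapes and the ignored class index are just made up for the example):

```python
import torch
import torch.nn as nn

# Minimal sketch, assuming 1000 classes where index 999 should be ignored;
# batch size and the ignored index are made up for illustration.
criterion = nn.CrossEntropyLoss(ignore_index=999)

logits = torch.randn(8, 1000)           # [batch_size, nb_classes]
targets = torch.randint(0, 1000, (8,))  # class indices in [0, 999]
targets[0] = 999                        # this sample is skipped by the loss

loss = criterion(logits, targets)
```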
In a binary classification, you could still use nn.CrossEntropyLoss with two outputs (possibly more, if you want to ignore a certain class) or alternatively nn.BCE(WithLogits)Loss.
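Here's a small sketch of both options for the binary case (all shapes and values are arbitrary):

```python
import torch
import torch.nn as nn

batch_size = 8

# Option 1: treat the binary problem as a 2-class multi-class problem.
ce = nn.CrossEntropyLoss()
logits_two = torch.randn(batch_size, 2)           # two output neurons
targets_idx = torch.randint(0, 2, (batch_size,))  # class indices 0/1
loss_ce = ce(logits_two, targets_idx)

# Option 2: a single output neuron with nn.BCEWithLogitsLoss.
bce = nn.BCEWithLogitsLoss()
logits_one = torch.randn(batch_size, 1)           # one output neuron (logit)
targets_float = targets_idx.float().unsqueeze(1)  # float targets in {0., 1.}
loss_bce = bce(logits_one, targets_float)
```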
An ignore_index argument doesn't really make sense in the latter case, since we are dealing with float targets and a single output neuron, which gives us the logit (i.e. the probability after applying sigmoid) of the positive class. Ignoring a class in a binary setup seems a bit strange, and it might be simpler to just calculate the loss for a single class instead (if that's the use case).
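If you really needed something like this, one hypothetical workaround (not a built-in feature) would be to compute the unreduced loss and mask out the samples you want to skip; the -1. sentinel below is just my arbitrary choice:

```python
import torch
import torch.nn as nn

# Hypothetical sketch: emulate "ignoring" samples in the binary setup by
# masking the unreduced per-sample losses. The -1. sentinel target is an
# arbitrary convention for this example.
criterion = nn.BCEWithLogitsLoss(reduction='none')

logits = torch.randn(8)
targets = torch.tensor([0., 1., 1., -1., 0., 1., -1., 0.])  # -1. marks "ignore"

mask = targets != -1.
loss_raw = criterion(logits, targets.clamp(min=0.))  # clamp keeps the math valid
loss = loss_raw[mask].mean()                         # average over kept samples only
```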
For a multi-label classification, I would also use nn.BCE(WithLogits)Loss, where each output neuron corresponds to the logit (probability after sigmoid) of the corresponding class.
Ignoring certain classes in this use case could in fact make sense.
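E.g. a minimal sketch, where a per-class mask (dropping class 2 here, arbitrarily chosen) removes the ignored class from the loss:

```python
import torch
import torch.nn as nn

# Minimal multi-label sketch, assuming 5 classes with multi-hot float targets;
# the mask dropping class 2 is a made-up example of ignoring a class.
criterion = nn.BCEWithLogitsLoss(reduction='none')

logits = torch.randn(8, 5)                     # [batch_size, nb_classes]
targets = torch.randint(0, 2, (8, 5)).float()  # each class is active (1.) or not (0.)

class_mask = torch.ones(5)
class_mask[2] = 0.                             # drop class 2 from the loss

loss_raw = criterion(logits, targets)          # unreduced: [batch_size, nb_classes]
loss = (loss_raw * class_mask).sum() / (class_mask.sum() * logits.size(0))
```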