Two output nodes for binary classification

  1. For a binary classification use case, you could use a single output and a threshold (as you’ve explained), or alternatively you could treat it as a multi-class classification with just two classes, so that each class gets its own output neuron. The two approaches use different loss functions.
    In the first case (single output), you would use e.g. nn.BCEWithLogitsLoss, and the output tensor shape should match the target shape (with the target as a floating point tensor).
    In the latter case, you would use e.g. nn.CrossEntropyLoss, and the target tensor should contain the class indices in the range [0, nb_classes-1] and omit the “class dimension” (usually the channel dim), so an output of shape [batch_size, 2] pairs with a target of shape [batch_size].

Both approaches expect logits, so you should remove your softmax layer and just pass the last output to the criterion.
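
Here is a minimal sketch of both setups side by side, in case it helps to see the shapes. The layer sizes, the random input, and the names (model_single, model_two, etc.) are made up for illustration and are not taken from your model:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes and data, only for illustration
batch_size, nb_features = 4, 10
x = torch.randn(batch_size, nb_features)
labels = torch.tensor([0, 1, 1, 0])  # class indices, dtype int64

# Approach 1: single output neuron + nn.BCEWithLogitsLoss
model_single = nn.Linear(nb_features, 1)
logits_single = model_single(x)              # shape [batch_size, 1], raw logits
target_single = labels.float().unsqueeze(1)  # float target, same shape as the output
loss_single = nn.BCEWithLogitsLoss()(logits_single, target_single)
# Apply sigmoid + threshold only for the prediction, not before the loss
pred_single = (torch.sigmoid(logits_single) > 0.5).long().squeeze(1)

# Approach 2: two output neurons + nn.CrossEntropyLoss
model_two = nn.Linear(nb_features, 2)
logits_two = model_two(x)                    # shape [batch_size, 2], raw logits
# Target holds class indices and has no class dimension: shape [batch_size]
loss_two = nn.CrossEntropyLoss()(logits_two, labels)
pred_two = logits_two.argmax(dim=1)          # no softmax needed for the argmax

print(loss_single.item(), loss_two.item())
```

Note that neither model ends in a softmax or sigmoid: both criteria apply the corresponding (log-)activation internally.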

  2. A final linear layer is not strictly necessary if you make sure your output and target have the right shapes.