Should I use softmax on the last layer with cross-entropy loss for a binary classification? Is there a cheat sheet out there on how to pair up the last layer and the loss criterion?
The docs for the loss functions (e.g. nn.CrossEntropyLoss) provide the necessary information on how to pass the inputs and targets.
nn.CrossEntropyLoss expects logits, so you shouldn't apply a softmax on your model outputs.
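For example, something like this (just a toy sketch; the layer sizes, batch size, and variable names are made up):

```python
import torch
import torch.nn as nn

# toy setup: 10 input features, 2 classes (binary classification as 2-class problem)
model = nn.Sequential(
    nn.Linear(10, 16),
    nn.ReLU(),
    nn.Linear(16, 2),  # last layer returns raw logits, no softmax here
)
criterion = nn.CrossEntropyLoss()

x = torch.randn(4, 10)               # batch of 4 samples
target = torch.randint(0, 2, (4,))   # class indices 0 or 1

logits = model(x)                    # shape [4, 2], unnormalized scores
loss = criterion(logits, target)
loss.backward()
```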
So I don't see "logits" mentioned in the docs. Do I just let the linear layers run when constructing the model class and apply the loss to whatever the final layer puts out?
Yes, just pass the output of your last (linear) layer directly to nn.CrossEntropyLoss, as internally F.log_softmax and nn.NLLLoss will be applied.
In the docs "scores" is used, so you are right about the missing "logits":
The input is expected to contain raw, unnormalized scores for each class.
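You can check the equivalence yourself with a quick sketch using random values:

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 2)
target = torch.randint(0, 2, (4,))

# cross entropy on the raw logits ...
loss_ce = F.cross_entropy(logits, target)

# ... matches log_softmax followed by the negative log likelihood loss
loss_nll = F.nll_loss(F.log_softmax(logits, dim=1), target)

print(torch.allclose(loss_ce, loss_nll))  # True
```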
That's awesome. Thanks.
Hi @ptrblck. I expected that when I take the exponent of the predicted values after sending the data through, they would sum to 1. I don't get that. Am I missing something?
If you apply F.log_softmax manually, the exponent of this output should sum to one.
Which output are you using at the moment? The model output or the (unreduced) loss function output?
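To illustrate the difference (a quick sketch with random values):

```python
import torch
import torch.nn.functional as F

output = torch.randn(4, 2)            # raw model output (logits)
print(output.exp().sum(dim=1))        # generally does not sum to 1

log_probs = F.log_softmax(output, dim=1)
print(log_probs.exp().sum(dim=1))     # sums to 1 for each sample
```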
The output is just the output layer by itself. Should I switch to log_softmax(x)?
In that case, you cannot expect the exponent of the logits to sum to one.
To get the probabilities, you could use softmax on the output of the last layer.
Just make sure not to pass the softmaxed output to nn.CrossEntropyLoss.
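E.g., reusing the toy model, x, and target from the sketch above:

```python
import torch.nn.functional as F

logits = model(x)                       # raw output of the last linear layer

probs = F.softmax(logits, dim=1)        # probabilities, for printing/debugging only
print(probs.sum(dim=1))                 # each row sums to 1

loss = F.cross_entropy(logits, target)  # the loss still gets the raw logits
```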
So I use log_softmax() on the output with nn.CrossEntropyLoss? Or log_softmax() with nn.NLLLoss()?
You can use:
- raw logits (no activation function at the end, just the raw output of the last layer) + nn.CrossEntropyLoss
- F.log_softmax on the model output + nn.NLLLoss
If you need to see the probabilities for debugging/printing purposes:
- for raw logits: apply softmax and print the output
- for the log_softmax output: apply exp() and print the output
Make sure not to pass the probabilities to the loss functions. A minimal sketch of the second pairing (F.log_softmax + nn.NLLLoss) follows below, again with placeholder sizes:
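```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(10, 2)              # placeholder model, last layer -> 2 classes
criterion = nn.NLLLoss()

x = torch.randn(4, 10)
target = torch.randint(0, 2, (4,))

log_probs = F.log_softmax(model(x), dim=1)
loss = criterion(log_probs, target)   # nn.NLLLoss expects log probabilities
loss.backward()

probs = log_probs.exp()               # for debugging/printing only
print(probs.sum(dim=1))               # each row sums to 1
```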