Should I use softmax on the last layer with cross entropy loss for binary classification? Is there a cheat sheet out there on how to pair up the last layer and the loss criterion?
The docs for the loss functions (e.g. nn.CrossEntropyLoss) provide the necessary information on how to pass the inputs and targets.
nn.CrossEntropyLoss expects logits, so you shouldn’t apply a softmax on your model outputs.
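For example, a minimal sketch (the shapes and tensors here are made up just for illustration):

```python
import torch
import torch.nn as nn

# hypothetical batch of 4 samples, 2 classes (binary classification)
logits = torch.randn(4, 2)            # raw model output, no softmax applied
targets = torch.tensor([0, 1, 1, 0])  # class indices

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, targets)     # softmax is applied internally
```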
So I don’t see “logits” in the docs. Do I just let the linear layers run when constructing the model class and apply the loss to whatever the final layer puts out?
Yes, just pass the output of your last (linear) layer directly to nn.CrossEntropyLoss, as internally F.log_softmax and nn.NLLLoss will be applied.
In the docs “scores” is used, so you are right about the missing “logits”.
The input is expected to contain raw, unnormalized scores for each class.
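To see this equivalence, a quick sketch (random tensors, purely for illustration):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 2)            # hypothetical raw, unnormalized scores
targets = torch.tensor([0, 1, 1, 0])

# nn.CrossEntropyLoss on raw logits ...
loss_ce = nn.CrossEntropyLoss()(logits, targets)

# ... matches F.log_softmax followed by nn.NLLLoss
loss_nll = nn.NLLLoss()(F.log_softmax(logits, dim=1), targets)

print(torch.allclose(loss_ce, loss_nll))
```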
That’s awesome. Thanks.
Hi @ptrblck. I expected that when I take the exponent of the predicted values after sending the data through, they would sum to 1. I don’t get that. Am I missing something?
If you apply F.log_softmax manually, the exponent of this output should sum to one.
Which output are you using at the moment? The model output or the (unreduced) loss function output?
The output is just the output layer by itself. Should I switch to log_softmax(x)?
In that case, you cannot expect the exponent of the raw logits to sum to one.
To get the probabilities, you could apply softmax to the output of the last layer.
Just make sure not to pass the softmax’ed output to nn.CrossEntropyLoss.
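A small sketch of the difference (random logits, just for illustration):

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 2)  # hypothetical raw model output

# exponent of raw logits generally does NOT sum to one per sample
print(logits.exp().sum(dim=1))

# exponent of the log_softmax output does sum to one per sample
probs = F.log_softmax(logits, dim=1).exp()
print(probs.sum(dim=1))
```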
So I use log_softmax() on the output with nn.NLLLoss?
You can use:

- raw logits (no activation function at the end, just the raw output of the last layer) + nn.CrossEntropyLoss
- F.log_softmax on the model output + nn.NLLLoss

If you need to see the probabilities for debugging/printing purposes:

- use softmax on the logits and print the output
- or, if you are already using F.log_softmax, call exp() on its output and print that

Make sure not to pass the probabilities to the loss functions.
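For the debugging/printing case, a minimal sketch (random logits, assumed shapes) showing that both routes give the same probabilities:

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 2)  # hypothetical raw model output

# for debugging/printing only -- never feed these to the loss:
probs_a = F.softmax(logits, dim=1)            # directly from raw logits
probs_b = F.log_softmax(logits, dim=1).exp()  # if you already use log_softmax

print(torch.allclose(probs_a, probs_b))
```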