# Classification using LogSoftmax vs Softmax and calculating a precision-recall curve?

For binary classification we could get the final output using LogSoftmax or Softmax. With Softmax we get results that add up to 1. I understand that LogSoftmax penalizes a wrong classification more heavily and has a few other mathematical advantages.

I have a binary classification problem with class 1 occurring very rarely (<2% of the time).

My questions:

1. If I am using a probability cutoff of 0.5 (predicting class 1 if the probability is above 0.5) with Softmax, will I get the same values for overall accuracy and for class-1 recall, precision, and F1 as when using LogSoftmax (and using the lower value of the output as the prediction class)?
2. How do I calculate the precision-recall curve when using LogSoftmax? This link says that "The precision-recall curve is constructed by calculating and plotting the precision against the recall for a single classifier at a variety of thresholds." How are we going to choose those thresholds if the output is not between 0 and 1?

No, `LogSoftmax` doesn't penalize a "more wrong" classification but applies the `log` to the `softmax` output in a numerically stable way.

1. If you are working on a binary classification use case and are thinking about using a threshold, I assume your output has the shape `[batch_size, 1]` and you would be using `nn.BCE(WithLogits)Loss`. In this case, no `(Log)Softmax` would be used, as you have a single output neuron. To get the prediction using a probability threshold you could use `torch.sigmoid(output_logits) > threshold`.
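A minimal sketch of this setup (the shapes, seed, and threshold are made up for illustration): a single-logit output passed to `nn.BCEWithLogitsLoss`, with predictions derived by applying the sigmoid and comparing against a threshold.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(8, 1)                    # raw model outputs, shape [batch_size, 1]
labels = torch.randint(0, 2, (8, 1)).float()  # binary targets

# nn.BCEWithLogitsLoss applies the sigmoid internally, so it receives raw logits.
loss = nn.BCEWithLogitsLoss()(logits, labels)

# For predictions, apply the sigmoid explicitly and compare to a threshold.
threshold = 0.5
preds = (torch.sigmoid(logits) > threshold).long()
```

Because the sigmoid output lives in `[0, 1]`, the threshold can be tuned freely, which is exactly what a precision-recall curve needs.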

2. Again, `LogSoftmax` is used with e.g. `nn.NLLLoss` for a multi-class classification.
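For context, a small sketch (random tensors, made-up shapes) of how `nn.LogSoftmax` pairs with `nn.NLLLoss` in a multi-class setting:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(8, 3)              # [batch_size, nb_classes]
targets = torch.randint(0, 3, (8,))     # class indices in [0, nb_classes-1]

# nn.NLLLoss expects log-probabilities, which nn.LogSoftmax produces.
log_probs = nn.LogSoftmax(dim=1)(logits)
loss = nn.NLLLoss()(log_probs, targets)
preds = log_probs.argmax(dim=1)         # predicted class indices
```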


I am still not clear.

In both the `softmax` and `logsoftmax` case, my neural network output has the shape `[batch_size, 2]`, and in both cases I am using `cross_entropy(probs, labels)`. The truth labels have the shape `[batch_size, 1]`. Do I need to change anything?

Question 3, based on your comment: is using `logsoftmax` not useful, since I am using `cross_entropy`?

Could you answer my questions 1 and 2?

1. Just trying to rephrase: would `logsoftmax` and `softmax` give the exact same output?
2. If I am using `logsoftmax`, how do I get a precision-recall curve?

Both `LogSoftmax` and `Softmax` are wrong if you are using `nn.CrossEntropyLoss`, as raw logits are expected and `nn.CrossEntropyLoss` will internally apply `LogSoftmax`.
In a multi-class classification, your target should have the shape `[batch_size]` and contain the class indices in `[0, nb_classes-1]`. Based on your description you are not working with `nn.BCEWithLogitsLoss` but are using `nn.CrossEntropyLoss` for a "2-class multi-class classification". @KFrank describes the difference in your cross-post.
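To illustrate why raw logits are expected, a small sketch (random tensors, not your actual model): `nn.CrossEntropyLoss` on logits matches `nn.NLLLoss` on their log-softmax, while feeding `Softmax` probabilities into `nn.CrossEntropyLoss` changes the loss.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(8, 2)              # raw model outputs
targets = torch.randint(0, 2, (8,))     # shape [batch_size], class indices

# nn.CrossEntropyLoss(logits) == nn.NLLLoss(log_softmax(logits)) internally.
ce = nn.CrossEntropyLoss()(logits, targets)
nll = nn.NLLLoss()(torch.log_softmax(logits, dim=1), targets)

# Passing Softmax probabilities instead of logits applies the softmax twice
# and yields a different (incorrect) loss.
wrong = nn.CrossEntropyLoss()(torch.softmax(logits, dim=1), targets)
```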
That is correct and you should remove it if you are using `nn.CrossEntropyLoss`. `LogSoftmax` is used with `nn.NLLLoss`.
No, since `LogSoftmax` applies the logarithm to the `Softmax` output.
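A quick sketch (random logits) of the relationship: exponentiating the `LogSoftmax` output recovers the `Softmax` probabilities, and since `log` is monotonic, both rank the classes identically and give the same argmax predictions even though the raw values differ.

```python
import torch

torch.manual_seed(0)
logits = torch.randn(4, 2)

softmax = torch.softmax(logits, dim=1)          # probabilities in [0, 1]
log_softmax = torch.log_softmax(logits, dim=1)  # log-probabilities, <= 0

# exp undoes the log, recovering the softmax probabilities.
recovered = log_softmax.exp()

# log is monotonic, so the argmax (and any probability threshold,
# after applying exp) gives identical predictions.
same_preds = torch.equal(softmax.argmax(dim=1), log_softmax.argmax(dim=1))
```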
As described before (and in your cross-post), you might want to switch to `nn.BCEWithLogitsLoss` for a binary classification to be able to use thresholds to create the predictions.
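As a sketch of the threshold sweep (random scores and labels, not real model outputs): with a single logit, `torch.sigmoid` maps the output into `[0, 1]`, and precision/recall can then be computed at each threshold; in practice `sklearn.metrics.precision_recall_curve` does this for you. If you instead keep a `[batch_size, 2]` `LogSoftmax` output, `log_probs[:, 1].exp()` gives an equivalent score in `[0, 1]`.

```python
import torch

torch.manual_seed(0)
logits = torch.randn(200)                   # hypothetical single-logit outputs
targets = (torch.rand(200) < 0.3).long()    # hypothetical imbalanced labels

probs = torch.sigmoid(logits)               # scores in [0, 1]

# Sweep thresholds over (0, 1) and compute precision/recall at each one.
precisions, recalls = [], []
for threshold in torch.linspace(0.05, 0.95, 19):
    preds = (probs > threshold).long()
    tp = ((preds == 1) & (targets == 1)).sum().item()
    fp = ((preds == 1) & (targets == 0)).sum().item()
    fn = ((preds == 0) & (targets == 1)).sum().item()
    if tp + fp == 0:
        continue                            # precision undefined at this threshold
    precisions.append(tp / (tp + fp))
    recalls.append(tp / (tp + fn))
```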