Log_softmax and softmax

I am training a model with nn.LogSoftmax(). Well, I am actually using torch.nn.CrossEntropyLoss() as the loss function, which combines nn.LogSoftmax() and nn.NLLLoss(). However, during the test phase, I want the model to output probabilities, not log-probabilities. So I would like to use nn.Softmax() for the test phase, not nn.LogSoftmax().

My question: after training with nn.LogSoftmax(), can I test the model with nn.Softmax() instead of nn.LogSoftmax()? Or do I need to train the model with nn.Softmax() to be able to use nn.Softmax() during the test phase?

Thanks, Carmelo

Hello Carmelo!

You can do anything you want in your test phase, regardless of how
you trained your model. (Just make sure that your test-phase results
mean what you want them to mean.)

You say you use CrossEntropyLoss. The output of your model
should therefore be raw-score logits, and these will most likely
be the output of a final Linear layer without any subsequent
activation function. That’s fine.
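
For concreteness, here is a minimal sketch of that training setup
(the toy model, sizes, and data below are hypothetical placeholders,
not your actual network):

```python
import torch
import torch.nn as nn

# Hypothetical toy setup: the final layer is Linear with no
# activation, so the model outputs raw-score logits.
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 5))
criterion = nn.CrossEntropyLoss()  # log_softmax + NLLLoss in one
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

inputs = torch.randn(8, 20)          # batch of 8 samples, 20 features
targets = torch.randint(0, 5, (8,))  # integer class labels in [0, 5)

optimizer.zero_grad()
logits = model(inputs)             # raw logits -- no softmax applied
loss = criterion(logits, targets)  # CrossEntropyLoss expects logits
loss.backward()
optimizer.step()
```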

If you now want to do something in your test phase with the actual
probabilities (instead of the logits), pass the output of your model
through softmax() to convert the logits to probabilities.
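
A test-phase sketch, continuing the toy model above (test_inputs is
a hypothetical batch):

```python
test_inputs = torch.randn(4, 20)  # hypothetical test batch

model.eval()
with torch.no_grad():
    logits = model(test_inputs)           # raw logits
    probs = torch.softmax(logits, dim=1)  # probabilities; each row sums to 1
```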

(You should think through why you want probabilities in your test
phase and whether or not you need them. If all you want to do is
convert those probabilities to integer class-label predictions, you
can do that by applying argmax() directly to the logits without
converting them to probabilities first.)
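
Continuing the snippet above, both routes give identical predictions,
because softmax() preserves the ordering of the logits:

```python
preds_from_logits = logits.argmax(dim=1)  # argmax on raw logits
preds_from_probs = probs.argmax(dim=1)    # argmax on probabilities
assert torch.equal(preds_from_logits, preds_from_probs)
```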

Best.

K. Frank
