I was looking at the MNIST example, and it has this line of code:
return F.log_softmax(x, dim=1)
then later uses:
loss = F.nll_loss(output, target)
What I don’t understand is why the MNIST example does that instead of just outputting x and then using the torch.nn.CrossEntropyLoss criterion?
I think it’s just a matter of taste.
When debugging a model, I like to “see” the (log-)probabilities of the output classes directly instead of comparing raw logits. In this example, I prefer to look at log_prob and the differences between the predictions rather than comparing each row of logits:
import torch
import torch.nn.functional as F

logits = torch.randn(10, 3)              # plain tensor; wrapping in Variable is deprecated
log_prob = F.log_softmax(logits, dim=1)  # log-probabilities, one row per sample
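For completeness, the two formulations compute the same loss, since nn.CrossEntropyLoss combines LogSoftmax and NLLLoss in a single call. Here is a minimal sketch checking that (the variable names are just for illustration):

import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(10, 3)           # raw scores for 10 samples, 3 classes
target = torch.randint(0, 3, (10,))   # ground-truth class indices

loss_a = F.nll_loss(F.log_softmax(logits, dim=1), target)  # MNIST-example style
loss_b = F.cross_entropy(logits, target)                   # single-call equivalent
print(torch.allclose(loss_a, loss_b))                      # True

One practical note: cross_entropy expects raw logits, so don’t feed it the output of log_softmax, or the normalization is effectively applied twice.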
Just use whatever fits your needs.