I was looking at the MNIST example and it had this line of code:

```
return F.log_softmax(x, dim=1)
```

Then later it uses:

```
loss = F.nll_loss(output, target)
```

What I don’t understand is why the MNIST example does that instead of just outputting `x` and then using the `torch.nn.CrossEntropyLoss` criterion layer?


I think it’s just a matter of taste.

When debugging the model, I like to “see” the log-probabilities of the output classes directly instead of comparing the logits.

In this example, I prefer to look at `log_prob` and the difference between each prediction instead of comparing each row of `logits`.

```
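import torch
import torch.nn.functional as F
logits = torch.randn(10, 3)  # 10 samples, 3 classes; the Variable wrapper is no longer needed
log_prob = F.log_softmax(logits, dim=1)  # each row holds log-probabilities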
```
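If you want actual probabilities rather than log-probabilities, exponentiating the output should do it. A minimal sketch, assuming the same random `logits` shape as above (the names `prob` and `row_sums` are just for illustration):

```python
import torch
import torch.nn.functional as F

logits = torch.randn(10, 3)              # raw scores: 10 samples, 3 classes
log_prob = F.log_softmax(logits, dim=1)  # log-probabilities per row

# exp() undoes the log, giving ordinary probabilities
prob = log_prob.exp()

# each row should sum to 1 (up to floating-point error)
row_sums = prob.sum(dim=1)
print(row_sums)
```

Each row of `prob` is then directly comparable across samples, which is harder to do by eye with raw logits.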

Just use whatever fits your needs.
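For what it’s worth, the two routes should give the same loss value, since cross-entropy on logits is `log_softmax` followed by `nll_loss`. A quick check, as a sketch with made-up random logits and targets (not the actual MNIST model):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(10, 3)          # stand-in for the model's raw output x
target = torch.randint(0, 3, (10,))  # hypothetical class labels

# Route 1 (as in the MNIST example): log_softmax in the model, nll_loss outside
loss_nll = F.nll_loss(F.log_softmax(logits, dim=1), target)

# Route 2: return raw logits and let the criterion apply log_softmax itself
loss_ce = nn.CrossEntropyLoss()(logits, target)

same_loss = torch.allclose(loss_nll, loss_ce)
print(loss_nll.item(), loss_ce.item(), same_loss)
```

So the choice mostly comes down to whether you want the model’s `forward` to return log-probabilities or raw logits.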
