Hello

I am taking a course on deep learning. As part of the coursework, I have to build a project on CNN classification. In my project, I used a softmax activation for the output layer because I am interested in probabilities rather than scores. I also used the cross-entropy loss function, as it worked better for my problem, and I got the expected results. I am already aware that the cross-entropy loss uses a combination of PyTorch's `log_softmax` and `NLLLoss` behind the scenes.

But my project submission was rejected, and the reviewer's comment was that a softmax activation *should not* be used with the cross-entropy loss function, per the PyTorch documentation.

I am seeking help from the experts here to understand why softmax and the cross-entropy loss function should not be used together in PyTorch.

Much appreciate your help

That's correct, and you've already described the reason yourself: the cross-entropy loss applies `log_softmax` followed by `NLLLoss` internally.

If you apply a `softmax` on your output, the loss calculation would use:

```
loss = F.nll_loss(F.log_softmax(F.softmax(logits, dim=1), dim=1), target)
```

which is wrong based on the formula for the cross-entropy loss, due to the additional `F.softmax`.
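To make the difference concrete, here is a minimal sketch with made-up logits and targets (both are assumptions for illustration, not values from the project above). It compares `F.cross_entropy` on raw logits with the same loss computed on already-softmaxed outputs. Because probabilities are confined to [0, 1], the second softmax flattens them, so even a confident, correct prediction cannot drive the loss close to zero.

```python
import torch
import torch.nn.functional as F

# Hypothetical logits: two samples, three classes, both very confident.
logits = torch.tensor([[10.0, 0.0, 0.0],
                       [0.0, 10.0, 0.0]])
target = torch.tensor([0, 1])  # both predictions are correct

# Intended usage: raw logits go into the loss, which applies log_softmax itself.
loss_correct = F.cross_entropy(logits, target)
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), target)

# Softmax in the model plus cross-entropy: softmax is effectively applied twice.
loss_double = F.cross_entropy(F.softmax(logits, dim=1), target)

print(torch.allclose(loss_correct, loss_manual))  # same computation
print(loss_correct.item())  # close to zero
print(loss_double.item())   # stays well above zero
```

Training may still appear to work with the extra softmax, but the gradients come from a distorted loss, which is presumably what the reviewer objected to.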

Thank you. If I need the probabilities as the model output at inference time, what should I do? I tried applying the softmax function outside the model architecture at inference time, but then I lose the `topk` function. If I need probabilities as well as functions like `topk`, what should I do?

Thanks

Ramaiah

Applying `softmax` to the output "outside the model" and not passing it to the loss function would be the proper way to get the probabilities.
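As a rough sketch of that split, assuming a tiny stand-in model (the `nn.Sequential` below, the input shape, and the targets are all hypothetical), the model returns raw logits, the loss consumes them directly during training, and `softmax` is applied only at inference time:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical stand-in for the CNN; the last layer outputs raw logits (no softmax).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 5))
criterion = nn.CrossEntropyLoss()

x = torch.randn(2, 3, 8, 8)     # fake batch of two 3x8x8 images
target = torch.tensor([1, 4])   # fake class indices

# Training step: logits go straight into the loss.
loss = criterion(model(x), target)
loss.backward()

# Inference: convert logits to probabilities outside the model.
model.eval()
with torch.no_grad():
    probs = F.softmax(model(x), dim=1)

print(probs.sum(dim=1))  # each row sums to 1
```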

I don't fully understand the `topk` issue you describe, as `softmax` won't change the order of the values:

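```
import torch
import torch.nn.functional as F

logits = torch.randn(10, 10)
# top-3 class indices taken directly from the logits
preds = torch.topk(logits, k=3, dim=1).indices
prob = F.softmax(logits, dim=1)
# top-3 class indices taken from the probabilities
preds_prob = torch.topk(prob, k=3, dim=1).indices
print((preds==preds_prob).all())
# > tensor(True)
```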
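If the goal is to get the top-k classes together with their probabilities, one option is to call `torch.topk` on the softmax output and read both fields of the result. A small sketch with made-up logits for a single sample:

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, 0.5, -1.0, 3.0]])  # hypothetical logits, one sample
probs = F.softmax(logits, dim=1)

# topk returns both the selected values and their positions.
top = torch.topk(probs, k=2, dim=1)
print(top.indices)  # tensor([[3, 0]])
print(top.values)   # the matching probabilities, in descending order
```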
