I know `nn.CrossEntropyLoss()` incorporates softmax, so my model does not add a softmax layer to its output. But when I call `loss.backward()`, should I first transform the outputs with softmax, e.g. `outputs = torch.softmax(outputs, dim=1).cpu().detach().numpy()`?
As you said, you should not apply softmax to the outputs before passing them to `nn.CrossEntropyLoss`; the criterion expects raw logits and applies log-softmax internally, so `loss.backward()` works on them directly.
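A minimal sketch of the intended pattern (the linear model, batch size, and class count here are made up for illustration):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 5)           # last layer produces raw logits, no softmax
criterion = nn.CrossEntropyLoss()  # applies log_softmax + NLLLoss internally

inputs = torch.randn(8, 10)          # hypothetical batch of 8 samples
targets = torch.randint(0, 5, (8,))  # class indices in [0, 5)

outputs = model(inputs)              # raw logits, shape [8, 5]
loss = criterion(outputs, targets)   # loss computed directly from the logits
loss.backward()                      # gradients flow through the logits
```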
If you've calculated the loss from the raw logits and now want the probabilities for debugging purposes etc., you can of course apply softmax afterwards.
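Continuing the hypothetical sketch above, the probabilities can be derived from the same `outputs` without affecting the gradient computation:

```python
with torch.no_grad():                      # inspection only, no gradients needed
    probs = torch.softmax(outputs, dim=1)  # convert logits to probabilities

print(probs.sum(dim=1))                    # sanity check: each row sums to 1.0
```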