Should I use softmax activation before cross-entropy loss backward?

I know nn.CrossEntropyLoss() incorporates softmax, so my model does not add a softmax layer to its output. But before I call loss.backward(), should I convert the outputs with softmax, e.g. outputs = torch.softmax(outputs, dim=1).cpu().detach().numpy()?

As you said, you should not apply softmax to the outputs before passing them to nn.CrossEntropyLoss.

If you’ve calculated the loss from the raw logits and now want the probabilities for debugging or inspection, then you can of course apply the softmax afterwards.
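A minimal sketch of what that looks like, assuming a toy setup with 4 samples and 3 classes (the shapes and tensor names are just for illustration):

```python
import torch
import torch.nn as nn

# Toy example: 4 samples, 3 classes (assumed shapes for illustration).
logits = torch.randn(4, 3, requires_grad=True)   # raw model outputs, no softmax applied
targets = torch.tensor([0, 2, 1, 2])

criterion = nn.CrossEntropyLoss()   # internally applies log_softmax + NLL
loss = criterion(logits, targets)   # pass the raw logits directly
loss.backward()                     # gradients are computed w.r.t. the raw logits

# Only afterwards, for debugging/inspection: convert logits to probabilities.
probs = torch.softmax(logits, dim=1).detach().cpu().numpy()
print(probs)
```

The key point is that the softmax here is applied on a detached copy purely for viewing the probabilities; it plays no part in the loss computation or the backward pass.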