`nn.CrossEntropyLoss` uses `F.log_softmax` and `F.nll_loss` internally, so it expects raw logits. If you apply a softmax or sigmoid to your model output first, you pass unexpected values to the criterion, which then applies `log_softmax` on top of them.
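To make this concrete, here is a minimal sketch comparing the two cases. The tensor shapes, the seed, and the variable names are just assumptions for illustration, not taken from your code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)           # hypothetical raw model outputs: 4 samples, 3 classes
target = torch.tensor([0, 2, 1, 2])  # hypothetical class indices

criterion = nn.CrossEntropyLoss()

# Intended usage: pass the raw logits directly.
loss_logits = criterion(logits, target)

# Same computation written out by hand: log_softmax followed by nll_loss.
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), target)

# Problematic usage: softmax first, so the criterion applies log_softmax
# to values that are already probabilities in [0, 1].
loss_double = criterion(F.softmax(logits, dim=1), target)

print(loss_logits.item(), loss_manual.item(), loss_double.item())
```

The first two values should match, while the third differs. Since the inputs in the last case are squashed into [0, 1], that loss cannot approach zero even for confident, correct predictions, which usually hurts training.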
Makes sense. Thanks.