Hi, I am observing some weird behaviour.
I have made a classifier and tried two different output/loss combinations: 1) Softmax with CrossEntropyLoss, and 2) LogSoftmax with NLLLoss.
When I run them, both start with an initial loss of about 1.38, but the LogSoftmax/NLLLoss loss keeps falling all the way down to about 0.25, whereas the Softmax/CrossEntropyLoss loss plateaus around 0.9.
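Here is a minimal sketch of the two combinations I mean (the shapes, batch size, and class count are made up for illustration; my actual model's logits stand in for `logits` here):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for my classifier's raw outputs:
# a batch of 8 examples over 4 classes.
logits = torch.randn(8, 4)
targets = torch.randint(0, 4, (8,))

# Combination 1: Softmax output fed into CrossEntropyLoss.
probs = nn.Softmax(dim=1)(logits)
loss1 = nn.CrossEntropyLoss()(probs, targets)

# Combination 2: LogSoftmax output fed into NLLLoss.
log_probs = nn.LogSoftmax(dim=1)(logits)
loss2 = nn.NLLLoss()(log_probs, targets)

print(loss1.item(), loss2.item())  # the two loss values differ
```

Running something like this reproduces the mismatch: the two losses come out different even on identical inputs.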
Even weirder is that they both perform the same on the test set.
Am I right in thinking these two combinations should be roughly equivalent? And if so, shouldn't the LogSoftmax/NLLLoss model perform better on the test set, given its lower loss?