The documentation specifies that CrossEntropyLoss combines LogSoftmax and NLLLoss, which means I don’t have to add an nn.LogSoftmax layer to my model; I can just return the output of my last dense layer (I’ve put a quick check of that equivalence below, after the loop). However, since I’m not actually running the loss function during the validation stage, do I need to call nn.LogSoftmax on my model output myself? This is what the main part of my validation loop looks like:
# model ends in nn.Linear(..., num_outputs), so these are raw logits
probs = model(data)
# labels are one-hot, so recover the class indices first
label = torch.argmax(label, dim=1)
preds = torch.argmax(probs, dim=1)
running_correct += (preds == label).sum().item()
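For reference, this is my understanding of the equivalence the docs describe, written as a minimal self-contained check (the shapes and values here are made up, not from my actual model):

import torch
import torch.nn as nn

logits = torch.randn(4, 3)           # stand-in for the raw output of the last dense layer
target = torch.tensor([0, 2, 1, 0])  # class indices

ce = nn.CrossEntropyLoss()(logits, target)
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), target)
print(torch.allclose(ce, nll))  # expect True: CrossEntropyLoss = LogSoftmax + NLLLoss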
Basically, since I’m not running the loss function here, no softmax is ever applied to the model outputs. Should I call nn.LogSoftmax on probs before calculating the predictions? If that’s the case, then I assume it would be cleaner to use NLLLoss and add an nn.LogSoftmax layer to my model instead (NLLLoss expects log-probabilities, as I understand it).
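For concreteness, this is the kind of sanity check I have in mind (again with made-up shapes; logits stands in for my model output):

import torch
import torch.nn.functional as F

logits = torch.randn(8, 5)                # stand-in for model(data)
log_probs = F.log_softmax(logits, dim=1)  # the extra normalization step I'm asking about
preds_normalized = torch.argmax(log_probs, dim=1)
preds_raw = torch.argmax(logits, dim=1)
print(torch.equal(preds_normalized, preds_raw))  # if this is always True, the extra step wouldn't change accuracy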