I’m learning how to use torch.nn.NLLLoss() and torch.nn.LogSoftmax(), and I’m confused by the results they produce.
```python
lsm = torch.nn.LogSoftmax(dim=-1)
nll = torch.nn.NLLLoss()
grnd_truth = torch.tensor([1])  # let's say it predicted correctly!!!
raw_logits = torch.tensor([[0.3665, 0.5542, -1.0306]])
logsoftmax = lsm(raw_logits)
final_loss = nll(logsoftmax, grnd_truth)

logsoftmax:
>>> tensor([[-0.8976, -0.7099, -2.2947]])
final_loss:
>>> tensor(0.7099)
# Is it normal to get such a large loss when the prediction is correct?

# let's try an incorrect prediction
grnd_truth = torch.tensor([0])
nll(lsm(raw_logits), grnd_truth)
>>> tensor(0.8976)
# Shouldn't this be larger?
```
Clearly, the loss for the misclassified case is not much larger than the loss for the correctly classified case; the two values are nearly the same.
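For what it's worth, a minimal sketch of where these numbers come from (assuming, as the printed loss of 0.7099 suggests, that class 1 is the ground-truth label): NLLLoss simply reads off the negative log-probability of the target class, so the loss reflects the model's *confidence*, not just whether the argmax is correct.

```python
import torch

# NLLLoss returns -log p(target), so even a correct argmax prediction
# yields a sizeable loss if the predicted probability is not close to 1.
lsm = torch.nn.LogSoftmax(dim=-1)
raw_logits = torch.tensor([[0.3665, 0.5542, -1.0306]])
log_probs = lsm(raw_logits)

probs = log_probs.exp()  # softmax probabilities, sum to 1
# The model assigns only ~49% probability to class 1, so the loss for a
# "correct" prediction is -log(0.49) ≈ 0.71.
print(probs)
print(-log_probs[0, 1])
```

So the 0.7099 here says "the model put about 49% of its probability mass on the right class", which is a weak prediction even though the argmax is correct.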
I read the NLLLoss docs and I know its formula. I assume the designers had their reasons for choosing it, but I haven't understood them yet.
In fact, the reason I'm posting this question is that I'm having difficulty training a three-class classification model with NLLLoss() and LogSoftmax().
Isn't the goal of a loss function to make the loss as small as possible when the prediction is right, and as large as possible when it's wrong?
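As a sanity check on that intuition, a minimal sketch (with made-up logits) showing how NLLLoss behaves as the margin for the true class grows: the loss does go toward 0 for confident correct predictions, and a confidently wrong prediction is penalized heavily.

```python
import torch

# As the logit margin for the true class (index 1) grows, the NLL loss
# shrinks toward 0; the same margin on a wrong class is punished hard.
lsm = torch.nn.LogSoftmax(dim=-1)
nll = torch.nn.NLLLoss()
target = torch.tensor([1])

losses = []
for margin in (0.2, 2.0, 10.0):
    logits = torch.tensor([[0.0, margin, 0.0]])
    losses.append(nll(lsm(logits), target).item())

print(losses)  # monotonically decreasing as confidence rises

# Same confident logits, but the ground truth is class 0 instead:
wrong = nll(lsm(torch.tensor([[0.0, 10.0, 0.0]])), torch.tensor([0])).item()
print(wrong)  # much larger than any of the correct-prediction losses
```

In other words, the near-equal losses in the original snippet come from the model being *unconfident* (logits 0.3665 vs 0.5542 are almost a coin flip between two classes), not from a flaw in the loss.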