Hi, I am confused about how to use torch.nn.NLLLoss. Below is a simple session in the Python REPL. I am expecting the result to be 0.35667494393873245, but I am getting -0.7000. I'd greatly appreciate it if someone could steer me in the right direction on this. Thanks!

>>> import torch
>>> import torch.nn as nn
>>> input = torch.tensor([[0.70, 0.26, 0.04]])
>>> loss = nn.NLLLoss()
>>> target = torch.tensor([0])
>>> output = loss(input, target)
>>> output
tensor(-0.7000)
>>> import math
>>> -math.log(0.70)
0.35667494393873245

I think I can see what's happening now; I was confused about how NLLLoss works. NLLLoss doesn't compute a log anywhere: it just picks out the target entry of its input and multiplies it by -1. The calculation below shows that taking the log of a softmax output and passing it to NLLLoss produces the same result as running the input through log_softmax first and then through NLLLoss. It also shows that applying CrossEntropyLoss to the raw input is the same as applying NLLLoss to log_softmax(input). I'm guessing that the log_softmax approach is more numerically stable than using softmax first and computing the log of the result separately.
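The equivalences above can be checked with a short sketch (the scores in `raw_input` are arbitrary values I made up for illustration):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

raw_input = torch.tensor([[2.0, 1.0, 0.1]])  # arbitrary unnormalized scores (logits)
target = torch.tensor([0])

nll = nn.NLLLoss()
ce = nn.CrossEntropyLoss()

# Path 1: softmax, then an explicit log, then NLLLoss
probs = F.softmax(raw_input, dim=1)
loss_via_softmax = nll(torch.log(probs), target)

# Path 2: log_softmax, then NLLLoss
log_probs = F.log_softmax(raw_input, dim=1)
loss_via_log_softmax = nll(log_probs, target)

# Path 3: CrossEntropyLoss applied directly to the raw scores
loss_via_ce = ce(raw_input, target)

# All three paths agree, and each is just the negated target entry
# of the log-probabilities
print(loss_via_softmax, loss_via_log_softmax, loss_via_ce)
print(-log_probs[0, 0])
```

All three losses match, confirming that NLLLoss itself only negates (and, with a batch, averages) the target entries; the log is expected to have happened already.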

I understand from your experiment that F.nll_loss does not expect the likelihood as input, but the log likelihood (log softmax). Do you agree with this assessment?
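That assessment matches what a quick check with the functional API shows, using the numbers from the original post:

```python
import math
import torch
import torch.nn.functional as F

probs = torch.tensor([[0.70, 0.26, 0.04]])  # likelihoods from the original post
target = torch.tensor([0])

# Passing the likelihood directly just negates the target entry
print(F.nll_loss(probs, target))             # tensor(-0.7000)

# Passing the log likelihood gives the expected -log(0.70)
print(F.nll_loss(torch.log(probs), target))  # tensor(0.3567)
```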

If that is so, I guess this is a common source of errors, because I believe most people would assume (wrongly) that they should pass the likelihood to F.nll_loss in the forward pass. Right?