nn.NLLLoss() gives negative result - what does it mean?

I saw code which uses nn.NLLLoss() (negative log likelihood loss).

I looked at the results, and some of the loss values (outputs of nn.NLLLoss()) are negative.

What do negative values of this loss mean?


I don’t know what a negative loss would represent here, but I would guess your inputs are in the wrong range, as seen in this example:

import torch
import torch.nn as nn
import torch.nn.functional as F

criterion = nn.NLLLoss()

# correct inputs: log-probabilities
output = F.log_softmax(torch.randn(10, 10), dim=1)
target = torch.randint(0, 10, (10,))
loss = criterion(output, target)
print(loss)
# tensor(2.6895)

# wrong inputs: probabilities instead of log-probabilities
output = F.softmax(torch.randn(10, 10), dim=1)
loss = criterion(output, target)
print(loss)
# tensor(-0.1101)

nn.NLLLoss expects log probabilities as the model output, so make sure F.log_softmax is applied to the model output before computing the loss.
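As a side note, nn.CrossEntropyLoss combines F.log_softmax and nn.NLLLoss in a single call, so passing the raw logits to it avoids this pitfall entirely. A quick check that the two are equivalent:

import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.randn(10, 10)
target = torch.randint(0, 10, (10,))

# NLLLoss on log-probabilities ...
nll = nn.NLLLoss()(F.log_softmax(logits, dim=1), target)

# ... matches CrossEntropyLoss on the raw logits
ce = nn.CrossEntropyLoss()(logits, target)
print(torch.allclose(nll, ce))
# True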


Hi all, sorry to revive this very old thread, but I have a very similar issue with the GaussianNLLLoss function. My NN produces two tensors (expectation and variance) of the same size as the target tensor. To avoid negative variances, I apply the exponential function to the variance tensor, which means the immediate output of the NN is effectively the log-variance; a minimal sketch is below. During training I often observe negative losses, which makes it hard to judge whether the optimization is actually converging. What am I missing? Thanks in advance for any hint 🙂
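Here is a minimal sketch of the setup (the real model is omitted, so the two tensors below just stand in for its outputs):

import torch
import torch.nn as nn

# stand-ins for the two NN output heads (real model omitted)
mean = torch.randn(32, 1, requires_grad=True)     # predicted expectation
log_var = torch.randn(32, 1, requires_grad=True)  # raw NN output, treated as log-variance
target = torch.randn(32, 1)

var = torch.exp(log_var)  # exp keeps the variance strictly positive
criterion = nn.GaussianNLLLoss()
loss = criterion(mean, target, var)  # this is what sometimes comes out negative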

Hi Meister!

It is perfectly reasonable for GaussianNLLLoss to return negative loss
values.

It is true that many loss criteria are never negative and become zero only
for perfect predictions (for example, MSELoss), but there is no logical
requirement for this to be the case.

GaussianNLLLoss computes “likelihoods” internally; these are values
of a probability density function. A probability density function must be
non-negative, but can range up to positive infinity. (For example, the
maximum value of the probability density function for a Gaussian diverges
to infinity as its variance goes to zero.) So the log of a probability density
function, the log-likelihood, can range from -inf to inf. Accordingly,
GaussianNLLLoss can return loss values that also range from -inf to inf.
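
You can check this concretely: with an accurate prediction and a variance
below one, the log term dominates and the loss comes out negative.

import torch
import torch.nn as nn

# default reduction='mean', full=False:
# loss = 0.5 * (log(var) + (input - target)**2 / var), constant term omitted
criterion = nn.GaussianNLLLoss()

mean = torch.zeros(5)         # perfect prediction ...
target = torch.zeros(5)
var = torch.full((5,), 0.01)  # ... with a small, confident variance

loss = criterion(mean, target, var)
print(loss)
# tensor(-2.3026), i.e. 0.5 * log(0.01)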

(In contrast, a probability ranges from zero to one, so a log-probability
ranges from -inf to 0.0. NLLLoss takes log-probabilities as input, not
likelihoods or log-likelihoods. When passed valid inputs, NLLLoss
returns values that range from 0.0 to inf.)

As an aside, it would have been better to post this as a new question, rather
than resurrecting a zombie thread.

Best.

K. Frank
