[SOLVED] NLLLoss not zero for identical input and target

I noticed something strange with the following input tensor (the 1 is at position 62):

tensor([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
         0., 0., 0., 0., 0., 0., 0., 0., 0.]])

and target :

tensor([62])

the following commands give the following output:

crit = nn.NLLLoss()
soft = nn.Softmax(dim=1)

crit(input[0, :].unsqueeze(0), target[0].unsqueeze(0))
Out[120]: tensor(-1.)

crit(soft(input[0].unsqueeze(0)), target[0].unsqueeze(0))
Out[135]: tensor(-0.0270)

This is not zero, but it should be… why is that so? The unsqueeze calls are there so that NLLLoss is satisfied with the input dimensions.
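For reference, here is a minimal self-contained sketch (assuming the same 99-element input with the 1 at index 62, as shown above) that reproduces the numbers:

import torch
import torch.nn as nn

input = torch.zeros(1, 99)   # 99 values, all zero except index 62
input[0, 62] = 1.
target = torch.tensor([62])

crit = nn.NLLLoss()
soft = nn.Softmax(dim=1)

print(crit(input, target))        # tensor(-1.)     NLLLoss just returns -input[0, 62]
print(crit(soft(input), target))  # tensor(-0.0270) softmax(input)[0, 62] is about 0.027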

Hello,

I think NLLLoss() works with log probabilities, so you should use LogSoftmax() instead of Softmax().
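For example, a minimal sketch (assuming the same one-hot input and target as in your post) of feeding log probabilities into NLLLoss:

import torch
import torch.nn as nn

input = torch.zeros(1, 99)
input[0, 62] = 1.
target = torch.tensor([62])

crit = nn.NLLLoss()
log_soft = nn.LogSoftmax(dim=1)

# NLLLoss expects log probabilities, so pass it the LogSoftmax output
print(crit(log_soft(input), target))  # tensor(3.6124) -- positive, but still not zero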

But from the documentation (https://pytorch.org/docs/stable/nn.html) you can see that LogSoftmax() computes LogSoftmax(x_i) = log(exp(x_i) / ∑_j exp(x_j)). In your case the denominator sums exp(0) = 1 ninety-eight times plus exp(1) = e, so the softmax value at index 62 is only e / (98 + e) ≈ 0.027 and its log is about -3.61, not 0. So the behavior is normal. Basically, if you want your loss to be 0, the value at index 62 should be really high compared to the other ones. Try putting 20 instead of 1 and you will get a loss that is essentially 0.
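A quick check of this (again assuming the 99-element input, just varying the value at index 62):

import torch
import torch.nn as nn

crit = nn.NLLLoss()
log_soft = nn.LogSoftmax(dim=1)
target = torch.tensor([62])

for value in (1., 20.):
    input = torch.zeros(1, 99)
    input[0, 62] = value
    print(value, crit(log_soft(input), target).item())

# 1.0  -> 3.612...  which is -log(e^1  / (98 + e^1))
# 20.0 -> ~2e-07    which is -log(e^20 / (98 + e^20)) ≈ 98 * e^-20, essentially 0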

Just tell me if I was not clear enough.


Changing the value from 1 to something higher solved the issue, thanks for the help. So I guess I can safely implement what I had in mind; the loss will behave as expected.