Error in the KL loss of PyTorch

DidierDeschamps · February 19, 2020, 6:38pm

https://pytorch.org/docs/stable/nn.html#kldivloss

The formula used is

l_n = y_n * (log(y_n) - x_n)

while it should be

l_n = y_n * (log(y_n) - log(x_n))

This could be solved with

loss(log(x), y)

instead of

loss(x, y)

But this is not pretty …

Did you do it on purpose, and if so, why?

tom · February 19, 2020, 7:26pm

As it says on the page:

"the input given is expected to contain log-probabilities and is not restricted to a 2D Tensor. The targets are given as probabilities (i.e. without taking the logarithm)."

So it’s not an error.
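
To make the convention concrete, here is a minimal sketch checking that KLDivLoss, when given log-probabilities as input, matches the textbook sum of y_n * (log(y_n) - log(x_n)). The tensors `q` and `p` are made-up example values, not something from the docs:

```python
import torch
import torch.nn as nn

# Hypothetical example distributions (each sums to 1).
q = torch.tensor([0.1, 0.2, 0.7])  # model probabilities, the "x" before taking the log
p = torch.tensor([0.3, 0.3, 0.4])  # target probabilities, the "y"

# KLDivLoss expects log-probabilities as input and probabilities as target.
loss_fn = nn.KLDivLoss(reduction="sum")
library_kl = loss_fn(q.log(), p)

# Textbook definition: sum_n y_n * (log(y_n) - log(x_n)).
manual_kl = (p * (p.log() - q.log())).sum()

print(library_kl.item(), manual_kl.item())
```

The two printed values should agree up to floating-point rounding, so the documented formula with x_n as a log-probability is the usual KL divergence.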

The reason is that starting from probabilities and then taking the logarithm is less numerically stable than passing log-probabilities in directly.
Also, it’s the same convention as NLLLoss, which likewise takes log-probabilities as input.
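
The stability point can be illustrated without PyTorch at all. The following is a plain-Python sketch of the idea, not PyTorch's actual implementation; the function names and the extreme `logits` values are assumptions chosen for the demonstration:

```python
import math

def log_softmax_naive(logits):
    """Softmax first, then log: very small probabilities underflow to 0."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # math.log(0.0) raises ValueError, so map an underflowed 0 to -inf explicitly.
    return [math.log(pr) if pr > 0.0 else float("-inf") for pr in probs]

def log_softmax_stable(logits):
    """Log-sum-exp form: x_i - (m + log(sum_j exp(x_j - m)))."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - lse for v in logits]

logits = [0.0, -1000.0]  # hypothetical, deliberately extreme scores
naive = log_softmax_naive(logits)
stable = log_softmax_stable(logits)
print(naive)   # second entry is -inf: the probability underflowed before the log
print(stable)  # second entry stays finite
```

Once a log-probability has become -inf, any loss computed from it is inf or nan. In PyTorch, `torch.nn.functional.log_softmax` gives log-probabilities directly from logits, which is the usual way to produce the input for both KLDivLoss and NLLLoss.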

Best regards

Thomas