With the following code:

`s = torch.softmax(torch.randn(100), dim=0)`

`d = torch.zeros(100)`

`d[43] = 1`

`loss = torch.nn.KLDivLoss()`

`loss(s,d)`

Somehow this returns a negative value. Why is that, and how am I supposed to fix it?

Thank you.

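So, drop the `softmax`: `KLDivLoss` expects its input to be log-probabilities, so pass the output of `log_softmax` instead.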
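A minimal sketch of that change, reusing the tensors from the question. The fixed seed, the `logits` name, and `reduction="sum"` are my own choices for illustration (the question uses the default reduction); the point is only that the input should be in log space.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # assumption: fixed seed so the run is repeatable
logits = torch.randn(100)

# Target: a one-hot distribution, so it sums to 1
d = torch.zeros(100)
d[43] = 1

# reduction="sum" is chosen here for clarity; it is not what the question used
loss = torch.nn.KLDivLoss(reduction="sum")

# Original call: probabilities are passed where log-probabilities are expected,
# so the result reduces to -softmax(logits)[43], which is negative
bad = loss(torch.softmax(logits, dim=0), d)

# Corrected call: log-probabilities as input,
# so the result is -log_softmax(logits)[43], which is positive
good = loss(F.log_softmax(logits, dim=0), d)

print(bad.item(), good.item())
```

With a one-hot target, the corrected loss is just the negative log-probability assigned to index 43, which cannot be negative.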

I was calculating a KL divergence loss and it came out negative, which led me here. Thanks for all the previous answers.

This is the mathematical proof of why the KL divergence loss should be non-negative:
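One standard way to see it, sketched under my own notation: write $x_i = e^{p_i}$ for the predicted probabilities (so $p$ is the log-space input), $q$ for the target, and sum only over indices with $q_i > 0$. Using the inequality $\log t \le t - 1$:

$$
-\sum_i q_i \log\frac{q_i}{x_i}
= \sum_i q_i \log\frac{x_i}{q_i}
\le \sum_i q_i\left(\frac{x_i}{q_i} - 1\right)
\le \sum_i x_i - \sum_i q_i
$$

Hence

$$
\sum_i q_i \log\frac{q_i}{x_i} \ge \sum_i q_i - \sum_i x_i ,
$$

and the right-hand side is zero when both $x$ and $q$ sum to one.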

The cornerstone of the proof is that, for `KLDivLoss(p, q)`, the target must sum to one (sum(q) = 1) for the loss to be guaranteed non-negative. So even if you have p = log_softmax(tensor), you might still get negative values if your target is not a true distribution, i.e. sum(q) != 1.
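A small sketch of that effect. The tensors below are made up for illustration, and `reduction="sum"` is my choice: the input is a proper `log_softmax` output in both cases, and only the target's normalization changes.

```python
import torch
import torch.nn.functional as F

# Input in log space: log_softmax of equal logits is a uniform distribution
p = F.log_softmax(torch.zeros(4), dim=0)

# Hypothetical target that sums to 0.5, so it is not a true distribution
q_raw = torch.tensor([0.35, 0.05, 0.05, 0.05])

# Same target rescaled so that it sums to 1
q = q_raw / q_raw.sum()

loss = torch.nn.KLDivLoss(reduction="sum")

unnormalized = loss(p, q_raw)  # negative, despite the log_softmax input
normalized = loss(p, q)        # non-negative once the target sums to 1

print(q_raw.sum().item(), unnormalized.item())
print(q.sum().item(), normalized.item())
```

If a target comes from counts or scores rather than probabilities, dividing by its sum as above (or applying a softmax) before computing the loss avoids this.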

Also, see the awesome discussion here: