# Loss class implementation for KLDivLoss

I am testing the KLDivLoss implementation here: https://github.com/liuzechun/ReActNet/blob/465f9ba458b3937915e5e5613a85b74123d9ff00/utils/KD_loss.py#L8

It can be simplified as:

```python
import torch
import torch.nn.functional as F
from torch.nn.modules.loss import _Loss


class DistributionLoss(_Loss):
    def forward(self, model_output, real_output):
        # Student log-probabilities and teacher probabilities
        model_output_log_prob = F.log_softmax(model_output, dim=1)
        real_output_soft = F.softmax(real_output, dim=1)
        del model_output, real_output

        # Reshape for batched matrix multiplication:
        # (B, 1, C) x (B, C, 1) -> (B, 1, 1)
        real_output_soft = real_output_soft.unsqueeze(1)
        model_output_log_prob = model_output_log_prob.unsqueeze(2)

        # Per-sample -sum_x P(x) * log Q(x), averaged over the batch
        cross_entropy_loss = -torch.bmm(real_output_soft, model_output_log_prob)
        cross_entropy_loss = cross_entropy_loss.mean()

        return cross_entropy_loss
```
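For reference, the `bmm` here just takes the per-sample inner product of the teacher probabilities with the student log-probabilities, so the forward pass is equivalent to a plain element-wise sum. A minimal sketch checking that equivalence (the logits tensors and their names are my own, not from the repo):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
student_logits = torch.randn(8, 5)  # hypothetical (batch, classes) logits
teacher_logits = torch.randn(8, 5)

p = F.softmax(teacher_logits, dim=1)        # teacher distribution P
log_q = F.log_softmax(student_logits, dim=1)  # student log Q

# bmm form used in the class: (B, 1, C) x (B, C, 1) -> (B, 1, 1)
bmm_loss = -torch.bmm(p.unsqueeze(1), log_q.unsqueeze(2)).mean()

# equivalent element-wise form
elem_loss = -(p * log_q).sum(dim=1).mean()

assert torch.allclose(bmm_loss, elem_loss, atol=1e-6)
```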



If we define the KL divergence as KL_loss = -∑ P(x) · log(Q(x) / P(x)) for distributions P and Q, then I couldn't find anywhere in the code above a division between Q and P (or a subtraction between log(Q) and log(P)).
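One relevant identity here: KL(P‖Q) = -∑ P(x) · log Q(x) + ∑ P(x) · log P(x), i.e. cross-entropy minus the entropy of P, and H(P) does not depend on the student output Q. This can be verified numerically; a sketch with my own made-up logits:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits_q = torch.randn(4, 10)  # hypothetical student logits
logits_p = torch.randn(4, 10)  # hypothetical teacher logits

p = F.softmax(logits_p, dim=1)
log_q = F.log_softmax(logits_q, dim=1)
log_p = F.log_softmax(logits_p, dim=1)

cross_entropy = -(p * log_q).sum(dim=1)  # -sum_x P(x) log Q(x)
kl = (p * (log_p - log_q)).sum(dim=1)    # sum_x P(x) log(P(x)/Q(x))
entropy = -(p * log_p).sum(dim=1)        # H(P), constant w.r.t. Q

# KL(P || Q) = CE(P, Q) - H(P)
assert torch.allclose(kl, cross_entropy - entropy, atol=1e-6)
```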

The following IPython history confirms what I suspected. However, when the same code is put into a class inheriting from _Loss, it just magically works out. I can't understand why. Please help.

In [60] runs the code outside the class => wrong answer.
In [61] runs the code inside the class => correct answer.
In [62] runs the "correct" code outside the class => correct answer.

```
In [60]: -torch.bmm(F.softmax(outputs_teacher, dim=1).unsqueeze(1), F.log_softmax(outputs, dim=1).unsqueeze(2)).mean()
```