Tutorial on creating new loss functions

Hello everyone!

I have been thinking about methods for image comparison. One idea is to compute the mutual information of two images. This is pretty straightforward to do in NumPy, but the result is unusable as a loss function because there is no backward method.
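For concreteness, here is a minimal sketch of the kind of NumPy computation meant above, assuming a histogram-based estimate. The function name `mutual_information` and the `bins` parameter are illustrative choices, not taken from any library. Because it relies on hard histogram binning in NumPy, no gradient can flow through it.

```python
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    """Histogram-based estimate of mutual information (in nats) between two images."""
    # Joint histogram of corresponding pixel intensities.
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    p_xy = joint / joint.sum()             # joint distribution p(x, y)
    p_x = p_xy.sum(axis=1, keepdims=True)  # marginal p(x), shape (bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)  # marginal p(y), shape (1, bins)
    nz = p_xy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(p_xy[nz] * np.log(p_xy[nz] / (p_x @ p_y)[nz])))

rng = np.random.default_rng(0)
a = rng.random((64, 64))
b = rng.random((64, 64))
mi_same = mutual_information(a, a)       # an image compared with itself
mi_unrelated = mutual_information(a, b)  # two unrelated noise images
```

An image compared with itself should score much higher than two unrelated images, which is what makes the quantity attractive as a similarity measure.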

One idea I had was to look at the KL divergence loss function and try to do something similar, but I got lost in the code. KL divergence and mutual information look somewhat alike, with the notable exception of the joint probability distribution in mutual information.
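The similarity is not accidental: mutual information is the KL divergence between the joint distribution p(x, y) and the product of the marginals p(x)p(y). One possible way to make it differentiable, sketched below under stated assumptions, is to replace the hard histogram with a soft (Gaussian-kernel) binning so that everything is built from ordinary tensor operations. This is a swapped-in technique and not something PyTorch provides. The name `soft_mutual_information` and the parameters `bins`, `sigma` and `eps` are hypothetical, and the inputs are assumed to be scaled to [0, 1].

```python
import torch

def soft_mutual_information(x, y, bins=16, sigma=0.05, eps=1e-8):
    """Differentiable mutual information estimate using soft histogram bins."""
    # Bin centers over the assumed intensity range [0, 1].
    centers = torch.linspace(0.0, 1.0, bins, dtype=x.dtype, device=x.device)
    # Soft assignment of every pixel to every bin, shape (N, bins).
    wx = torch.exp(-0.5 * ((x.reshape(-1, 1) - centers) / sigma) ** 2)
    wy = torch.exp(-0.5 * ((y.reshape(-1, 1) - centers) / sigma) ** 2)
    wx = wx / (wx.sum(dim=1, keepdim=True) + eps)
    wy = wy / (wy.sum(dim=1, keepdim=True) + eps)
    # Soft joint histogram, normalized to a (bins, bins) distribution.
    p_xy = wx.t() @ wy / x.numel()
    p_x = p_xy.sum(dim=1, keepdim=True)  # marginal p(x)
    p_y = p_xy.sum(dim=0, keepdim=True)  # marginal p(y)
    # KL divergence between the joint and the product of marginals.
    return (p_xy * torch.log((p_xy + eps) / (p_x @ p_y + eps))).sum()

torch.manual_seed(0)
x = torch.rand(1000, requires_grad=True)
y = torch.rand(1000)
loss = -soft_mutual_information(x, y)  # maximize MI by minimizing its negative
loss.backward()                        # autograd supplies the backward pass
```

The bin count and kernel width trade off resolution against smoothness of the gradient, so they would need tuning for real images.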

Maybe you can point me to the location of the KL divergence implementation in the code? Or maybe you know of someone who explains how to implement KL divergence (or even mutual information) as a loss function and can give me a link to their blog post?
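As a point of reference, a loss does not necessarily need a hand-written backward method: if it is composed of differentiable tensor operations, autograd derives the gradient. Below is a minimal sketch of KL divergence written that way. The helper `kl_divergence` and its `eps` parameter are illustrative, and the result is compared against the built-in `torch.nn.functional.kl_div`, which expects log-probabilities as its first argument.

```python
import torch
import torch.nn.functional as F

def kl_divergence(p, q, eps=1e-8):
    """KL(p || q) = sum_i p_i * (log p_i - log q_i), built from tensor ops only."""
    return (p * (torch.log(p + eps) - torch.log(q + eps))).sum()

torch.manual_seed(0)
logits = torch.randn(10, requires_grad=True)
q = torch.softmax(logits, dim=0)           # model distribution (depends on logits)
p = torch.softmax(torch.randn(10), dim=0)  # fixed target distribution

loss = kl_divergence(p, q)
loss.backward()                            # gradient with respect to logits

# Built-in equivalent: the input is log q, the target is p.
reference = F.kl_div(torch.log(q), p, reduction="sum")
```

Mutual information could be written in the same style once the joint and marginal distributions are themselves computed differentiably.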


The CUDA version is implemented here: https://github.com/pytorch/pytorch/blob/5d7770948587bb5f19b929200f20e2eaf7074e5c/aten/src/THCUNN/generic/DistKLDivCriterion.cu
It references some kernels here:

The CPU version is here:

Feel free to reply back if you have questions!


Whoa, it has been quite a while since I last saw C code. I will see what I can do with it. Thank you!