Hi, I'm trying to compute a KL divergence (optimize a KL loss) between two tensors. The problem is that these tensors hold very small numbers, around 1e-6. Why is that a problem? Because I need to normalize the teacher tensor with softmax, and on such small values softmax gives me a uniform distribution, which is not the actual case.
How can I overcome this issue? My only thought is to multiply all the values by a large constant.
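To illustrate the symptom: softmax only sees the *differences* between entries, so when those differences are on the order of 1e-6 the output is essentially uniform. Dividing by a small temperature (equivalently, multiplying by a large constant) restores the structure. A minimal sketch, where the temperature value 1e-6 is just an illustrative choice:

```python
import torch
import torch.nn.functional as F

# Tiny values: their differences (~1e-6) are invisible to softmax,
# so the result is essentially uniform.
t = torch.tensor([1e-6, 2e-6, 5e-6])
print(F.softmax(t, dim=-1))  # all entries ~0.3333

# Dividing by a small temperature T (here T = 1e-6, i.e. multiplying
# by 1e6) makes the relative differences visible again.
print(F.softmax(t / 1e-6, dim=-1))  # the largest entry now dominates
```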
My code is:
import torch.nn.functional as F

def MyKlLoss(output, target):
    # flatten to (N, C): collapse the first three dims, treat the last two as the class dim
    shapeto = [output.shape[0] * output.shape[1] * output.shape[2],
               output.shape[3] * output.shape[4]]
    o = F.log_softmax(output.view(shapeto), dim=-1)
    t = F.softmax(target.view(shapeto), dim=-1)  # target holds very low values (1e-5, 1e-6, ...),
                                                 # so t comes out uniform, all ~0.1111
    # 'batchmean' matches the mathematical definition of KL divergence;
    # the default 'mean' averages over every element instead
    return F.kl_div(o, t, reduction='batchmean')
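The multiply-by-a-constant idea is exactly softmax with a temperature T < 1. A sketch of the loss with that applied to the teacher side; the name kl_loss_with_temperature and the default T = 1e-6 are my own choices, and T would need tuning to the scale of your targets:

```python
import torch
import torch.nn.functional as F

def kl_loss_with_temperature(output, target, T=1e-6):
    # Flatten to (N, C): collapse the first three dims, keep the last two as classes.
    shapeto = [output.shape[0] * output.shape[1] * output.shape[2],
               output.shape[3] * output.shape[4]]
    o = F.log_softmax(output.view(shapeto), dim=-1)
    # Dividing by T rescales the tiny target values so softmax is no
    # longer near-uniform ("multiply by a high constant").
    t = F.softmax(target.view(shapeto) / T, dim=-1)
    # 'batchmean' gives the true KL divergence per sample.
    return F.kl_div(o, t, reduction='batchmean')
```

If the targets are already non-negative, another option worth considering is skipping softmax entirely and normalizing them by their sum along the class dimension, which preserves their relative magnitudes exactly.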
I'd be happy if someone has an idea how to make the KL loss meaningful in this case.
Thanks!