Hello, I have two cross-entropy loss functions, A and B.

The result is best when A is small and B is as big as possible, hence I could write the combined loss as A - B.

However, written like that the loss has no lower bound - it can reach arbitrarily large negative values. I am training in float16, so I am afraid numerical instability will be an issue.

Another possibility is A + 1/B, but I cannot guarantee that B will not be 0.

What is the best way to deal with it?

Hi Jakub!

Without knowing your use case nor where `A` and `B` come from, it's hard to recommend a particular choice. However, `torch.exp(-B)` would solve your immediate problem. It becomes smaller as `B` becomes larger and is bounded below by zero.

If you were concerned about `exp(-B)` overflowing to `inf` as `B` becomes large and negative, you could use `torch.sigmoid(-B)`. It is bounded above by one and approaches `exp(-B)` (and hence zero) as `B` becomes large and positive.
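A minimal sketch of the two penalties, using plain-Python `math` and hypothetical scalar values for `A` and `B` (in practice these would be tensors and you would use `torch.exp` / `torch.sigmoid`):

```python
import math

# Hypothetical scalar stand-ins for the two cross-entropy losses.
A, B = 0.5, 3.0

# exp(-B): shrinks toward zero as B grows, bounded below by zero.
loss_exp = A + math.exp(-B)

# sigmoid(-B) = 1 / (1 + exp(B)): also shrinks as B grows,
# but is additionally bounded above by one, so it cannot overflow
# even if B goes large and negative.
loss_sig = A + 1.0 / (1.0 + math.exp(B))

# Both penalty terms lie in (0, 1], so the combined loss is bounded below by A.
assert 0.0 < loss_exp - A <= 1.0
assert 0.0 < loss_sig - A <= 1.0

# For large positive B the two penalties nearly coincide:
# sigmoid(-B) = exp(-B) / (1 + exp(-B)) ~ exp(-B).
assert abs(math.exp(-10.0) - 1.0 / (1.0 + math.exp(10.0))) < 1e-6
```

Since `sigmoid(-B) = exp(-B) / (1 + exp(-B))`, the sigmoid penalty is always slightly smaller than the exp penalty, and saturates at one instead of blowing up - a useful property in float16.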

Best.

K. Frank

The likelihood is a number between 0 and 1, and the cross_entropy_loss is -ln(likelihood).

For A and B, we have likelihood_A = exp(-A) and likelihood_B = exp(-B).

In general, we want to maximize the likelihood (i.e. minimize the cross_entropy_loss), but from what I understood the opposite is true for likelihood_B.

For this case, my suggestion is to maximize likelihood_A * (1 - likelihood_B), that is, to minimize A - ln(1 - exp(-B)).

Any bounded, continuous, and strictly increasing function of A - ln(1 - exp(-B)) works too (e.g. tanh).
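A sketch of this suggestion with scalar inputs (an illustrative helper, not library code): `math.log1p(-math.exp(-B))` computes ln(1 - exp(-B)) accurately when exp(-B) is small, and a small clamp `eps` (an assumed hyperparameter) keeps B away from zero, where 1 - exp(-B) = 0 and the log diverges.

```python
import math

def combined_loss(A, B, eps=1e-6):
    # Minimizing A - ln(1 - exp(-B)) is equivalent to maximizing
    # likelihood_A * (1 - likelihood_B), with likelihood = exp(-loss).
    # Clamp B away from zero: at B == 0, exp(-B) == 1 and ln(0) diverges.
    B = max(B, eps)
    # log1p(-exp(-B)) = ln(1 - exp(-B)), stable when exp(-B) is tiny.
    return A - math.log1p(-math.exp(-B))

# As B grows, 1 - exp(-B) -> 1, the penalty vanishes, and only A remains:
assert abs(combined_loss(0.5, 20.0) - 0.5) < 1e-6
# Small B (likelihood_B close to 1) is penalized heavily:
assert combined_loss(0.5, 1e-3) > combined_loss(0.5, 1.0)
```

Note the loss is bounded below by A but still unbounded above as B approaches zero, which is why the clamp (or a bounded wrapper such as tanh, as mentioned above) matters in float16.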