Hello, I have two cross-entropy loss functions, A and B.

The result is best when A is small and B is as big as possible, hence I could write the combined loss as A - B.

However, written like that the loss has no lower bound - it can reach arbitrarily large negative values. I am training in float16, so I am afraid numerical instability will be an issue.

Another possibility is A + 1/B, but I cannot guarantee that B will not be 0.

What is the best way to deal with it?

Hi Jakub!

Without knowing your use case nor where `A` and `B` come from, it's hard to recommend a particular choice. However, `torch.exp(-B)` would solve your immediate problem. It becomes smaller as `B` becomes larger and is bounded below by zero.

If you were concerned about `exp(-B)` overflowing to `inf` as `B` becomes large and negative, you could use `torch.sigmoid(-B)`. It is bounded above by one and approaches `exp(-B)` (and hence zero) as `B` becomes large and positive.
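A minimal sketch of the two penalties, using plain-Python `math` and hypothetical scalar values for `A` and `B` (in practice these would be tensors and you would use `torch.exp` / `torch.sigmoid`):

```python
import math

# Hypothetical scalar stand-ins for the two cross-entropy losses.
A, B = 0.5, 3.0

# exp(-B): shrinks toward zero as B grows, bounded below by zero.
loss_exp = A + math.exp(-B)

# sigmoid(-B) = 1 / (1 + exp(B)): also shrinks as B grows,
# but is additionally bounded above by one, so it cannot overflow
# even if B goes large and negative.
loss_sig = A + 1.0 / (1.0 + math.exp(B))

# Both penalty terms lie in (0, 1], so the combined loss is bounded below by A.
assert 0.0 < loss_exp - A <= 1.0
assert 0.0 < loss_sig - A <= 1.0

# For large positive B the two penalties nearly coincide:
# sigmoid(-B) = exp(-B) / (1 + exp(-B)) ~ exp(-B).
assert abs(math.exp(-10.0) - 1.0 / (1.0 + math.exp(10.0))) < 1e-6
```

Since `sigmoid(-B) = exp(-B) / (1 + exp(-B))`, the sigmoid penalty is always slightly smaller than the exp penalty, and saturates at one instead of blowing up - a useful property in float16.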

Best.

K. Frank

The likelihood is a number between 0 and 1, and the cross_entropy_loss is -ln(likelihood).

For A and B, we have likelihood_A = exp(-A) and likelihood_B = exp(-B).

In general, we want to maximize the likelihood (i.e. minimize the cross_entropy_loss), but from what I understood the opposite is true for likelihood_B.

For this case, my suggestion is to maximize likelihood_A * (1 - likelihood_B), that is, to minimize A - ln(1 - exp(-B)).

Any bounded, continuous, and strictly increasing function of A - ln(1 - exp(-B)) works too (e.g. tanh).
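A sketch of this suggestion with scalar inputs (an illustrative helper, not library code): `math.log1p(-math.exp(-B))` computes ln(1 - exp(-B)) accurately when exp(-B) is small, and a small clamp `eps` (an assumed hyperparameter) keeps B away from zero, where 1 - exp(-B) = 0 and the log diverges.

```python
import math

def combined_loss(A, B, eps=1e-6):
    # Minimizing A - ln(1 - exp(-B)) is equivalent to maximizing
    # likelihood_A * (1 - likelihood_B), with likelihood = exp(-loss).
    # Clamp B away from zero: at B == 0, exp(-B) == 1 and ln(0) diverges.
    B = max(B, eps)
    # log1p(-exp(-B)) = ln(1 - exp(-B)), stable when exp(-B) is tiny.
    return A - math.log1p(-math.exp(-B))

# As B grows, 1 - exp(-B) -> 1, the penalty vanishes, and only A remains:
assert abs(combined_loss(0.5, 20.0) - 0.5) < 1e-6
# Small B (likelihood_B close to 1) is penalized heavily:
assert combined_loss(0.5, 1e-3) > combined_loss(0.5, 1.0)
```

Note the loss is bounded below by A but still unbounded above as B approaches zero, which is why the clamp (or a bounded wrapper such as tanh, as mentioned above) matters in float16.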