If I have a loss function of the form `torch.log(-B*torch.exp(X))`,

what is the best way to keep `torch.exp` and `torch.log` from producing nan?

I am assuming X is a real tensor…

What is the value of B? If it is positive, `-B*torch.exp(X)` is negative and `torch.log` of a negative number returns nan…

However, if it is negative, you can do

```
torch.log(torch.tensor(-B)) + X # since torch.log(torch.exp(X)) => X
```

to avoid really high values in exp…
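As a minimal sketch (with hypothetical values, assuming `B` is a negative scalar), the rewrite sidesteps the overflow the naive form hits for large `X`:

```python
import torch

# Hypothetical values; B is assumed to be a negative scalar
B = -2.0
X = torch.tensor([1.0, 50.0, 1000.0])

# Naive form: exp(1000.) overflows float32 to inf, so the log returns inf
naive = torch.log(-B * torch.exp(X))

# Stable rewrite: log(-B * exp(X)) = log(-B) + X
stable = torch.log(torch.tensor(-B)) + X
```

Where `exp` stays finite the two agree; where it overflows, only the rewrite gives a usable (and differentiable) value.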

the actual computation is `log E[-B*exp(X)]`, i.e. `torch.log(torch.mean(-B*torch.exp(X)))`.

And yes, both `B` and `X` are tensors, each the output of a different neural network, e.g. `B = NN_1(b)+some_additional_calculation` and `X = NN_2(x)+some_additional_calculation`.

logsumexp exists to tackle this case, using the identity:

```
log(exp(a) + exp(b)) = c + log(exp(a-c) + exp(b-c)),  c = max(a, b)
```

You can adapt this for scaling and the mean with: `K*exp(a) = exp(log(K))*exp(a) = exp(a + log(K))`
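A sketch of that adaptation for the mean above, assuming `B` is elementwise negative so `K = -B > 0`: fold the scale into the exponent, then `log(mean(K*exp(X))) = logsumexp(X + log(K)) - log(N)`.

```python
import torch

# Hypothetical tensors; B is assumed elementwise negative so that -B > 0
B = torch.tensor([-0.5, -2.0, -1.0])
X = torch.tensor([400.0, -50.0, 10.0])  # exp(400.) overflows float32

# Naive: the inf from exp(400.) propagates through the mean into the log
naive = torch.log(torch.mean(-B * torch.exp(X)))

# Stable: fold the scale into the exponent, let logsumexp subtract the max
# log(mean(-B * exp(X))) = logsumexp(X + log(-B)) - log(N)
log_n = torch.log(torch.tensor(float(X.numel())))
stable = torch.logsumexp(X + torch.log(-B), dim=0) - log_n
```

`torch.logsumexp` applies the max-subtraction trick internally, so the result stays finite even when the largest exponent would overflow on its own.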

Or just use .clamp() on problematic tensors.
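For the clamp route, one hedged sketch (hypothetical values, cap chosen for float32) is to bound the exponent before `exp` so it cannot overflow; note this distorts the loss for clamped entries and their gradient w.r.t. `X` becomes zero:

```python
import torch

B = torch.tensor(-2.0)
X = torch.tensor([1.0, 50.0, 1000.0])

# exp(80.) ~ 5.5e34 still fits in float32 (max ~3.4e38)
loss = torch.log(-B * torch.exp(X.clamp(max=80.0)))
```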


Thanks a lot for pointing that out!