[SOLVED] Should I use torch.clamp after torch.sigmoid?

Hello, I’ve tried to write a custom loss function, a weighted binary cross entropy, following a suggestion by @miguelvr here:


I used the following function (wrapped inside a class):

def weighted_binary_cross_entropy(output, target, weights=None):
        
    if weights is not None:
        assert len(weights) == 2
        
        loss = weights[1] * (target * torch.log(output)) + \
               weights[0] * ((1 - target) * torch.log(1 - output))
    else:
        loss = target * torch.log(output) + (1 - target) * torch.log(1 - output)

    return torch.neg(torch.mean(loss))

The problem is that sometimes this outputs nan or -inf.

Tracing it back, I reached the conclusion that my model sometimes outputs large negative logits, for example -136. This leads to:

torch.sigmoid(torch.tensor([-136.])) -> tensor([0.]) which leads to -inf in torch.log().

I am on PyTorch 1.0.1.

Is it ok to use torch.clamp(torch.sigmoid(torch.tensor([-136.])), min=1e-8, max=1 - 1e-8) ?
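For what it’s worth, clamping the sigmoid output into (eps, 1 - eps) does keep torch.log finite; a minimal sketch (the eps value is an arbitrary choice, not something from the thread):

```python
import torch

eps = 1e-8  # clamp margin; an assumption, pick to suit your dtype

logit = torch.tensor([-136.0])
p = torch.sigmoid(logit)                        # saturates to exactly 0.0
raw = torch.log(p)                              # -inf

clamped = torch.clamp(p, min=eps, max=1 - eps)  # now strictly inside (0, 1)
safe = torch.log(clamped)                       # finite: log(1e-8) ~ -18.42

print(raw.item(), safe.item())
```

Note that this only hides the saturation; the gradient of the clamped region is zero, which is one reason training can still stall (see below).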

That function had the same problem; in the end I used NLLLoss instead.

But yes, it is okay to clamp like that to fix the problem.

Thank you, I tried to use it like this, but it doesn’t work. The network does not train, even when the weights are (0.5, 0.5). Do you have any idea why?

Hello Andrei!

The short (narrow) answer – use LogSigmoid.
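To spell that suggestion out (a sketch, not code from the linked post): since log(sigmoid(x)) = logsigmoid(x) and log(1 - sigmoid(x)) = logsigmoid(-x), the weighted loss can be computed directly from the raw logits, so no probability ever saturates to 0 or 1:

```python
import torch
import torch.nn.functional as F

def weighted_bce_with_logits(logits, target, weights=None):
    """Numerically stable weighted BCE computed from raw logits.

    Uses log(sigmoid(x)) = logsigmoid(x) and
    log(1 - sigmoid(x)) = logsigmoid(-x), so nothing saturates.
    """
    log_p = F.logsigmoid(logits)      # log(sigmoid(x)), stable for large -x
    log_1mp = F.logsigmoid(-logits)   # log(1 - sigmoid(x)), stable for large +x

    if weights is not None:
        assert len(weights) == 2
        loss = weights[1] * target * log_p + \
               weights[0] * (1 - target) * log_1mp
    else:
        loss = target * log_p + (1 - target) * log_1mp

    return torch.neg(torch.mean(loss))

# Even an extreme logit like -136 now gives a finite loss:
loss = weighted_bce_with_logits(torch.tensor([-136.0]), torch.tensor([0.0]))
```

Unlike the clamp workaround, this keeps nonzero gradients everywhere, which is why it trains where the clamped version may not.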

Please see my reply to your “Weighted Binary Crossentropy”
post:

Best.

K. Frank

Thank you for your replies, I will mark this as solved.

One thing to note for anyone who might be thinking about implementing custom loss functions in the future: be careful about the numerical stability of your implementation.
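Along those lines, recent PyTorch versions ship torch.nn.BCEWithLogitsLoss, which fuses the sigmoid and the log in a numerically stable way; its pos_weight argument handles the common class-weighting case (note it rescales only the positive-class term, so it is not a drop-in for a two-weight scheme). A small sketch with made-up numbers:

```python
import torch

# pos_weight rescales the positive-class term (here: positives count double)
criterion = torch.nn.BCEWithLogitsLoss(pos_weight=torch.tensor([2.0]))

logits = torch.tensor([-136.0, 0.0, 5.0])   # raw model outputs, no sigmoid
target = torch.tensor([0.0, 1.0, 1.0])

loss = criterion(logits, target)            # finite even for the -136 logit
print(loss.item())
```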