Modified PyTorch loss function BCEWithLogitsLoss returns NaNs

I’m trying to solve a binary classification problem (target=0 and target=1) with one exception: some of my labels are deliberately set to target=0.5, and I want zero loss for those samples whether they are classified as 0 or 1 (i.e., both classes are “correct”).

I tried to implement a custom loss from scratch, based on PyTorch’s BCEWithLogitsLoss:

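import torch
import torch.nn.functional as F

class myLoss(torch.nn.Module):

    def __init__(self, pos_weight=1):
        super().__init__()  # initialize nn.Module internals before setting attributes
        self.pos_weight = pos_weight

    def forward(self, input, target):
        epsilon = 10 ** -44
        my_bce_loss = -1 * (self.pos_weight * target * F.logsigmoid(input + epsilon)
                            + (1 - target) * torch.log(1 - torch.sigmoid(input) + epsilon))
        # Weight is 1 for target in {0, 1} and 0 for target == 0.5
        add_loss = (target - 0.5) ** 2 * 4
        mean_loss = (my_bce_loss * add_loss).mean()
        return mean_loss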

I chose epsilon so that the log is bounded below by about -100, as is done in BCELoss.

However I’m still getting NaN errors:

Function 'LogBackward' returned nan values in its 0th output.


Function 'SigmoidBackward' returned nan values in its 0th output.
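A likely cause of these errors (an assumption, since the thread never pins it down) is that the epsilon keeps the forward value of `log(1 - sigmoid(input) + epsilon)` finite but does not protect the gradient. For a large logit, `1 - sigmoid(input)` is exactly 0 in float32, so the derivative of the log is roughly `1 / epsilon`, which overflows to inf, and the sigmoid derivative at that point is 0, so the product is NaN. The sketch below reproduces this for a single logit of 100 and compares it with the equivalent form `logsigmoid(-input)`; the variable names are illustrative only.

```python
import torch
import torch.nn.functional as F

eps = 10 ** -44  # same epsilon as in the question

# Naive form: log(1 - sigmoid(x) + eps)
x = torch.tensor([100.0], requires_grad=True)
naive = torch.log(1 - torch.sigmoid(x) + eps)
naive.sum().backward()
naive_grad = x.grad.clone()

# Equivalent stable form: log(1 - sigmoid(x)) == logsigmoid(-x)
x2 = torch.tensor([100.0], requires_grad=True)
stable = F.logsigmoid(-x2)
stable.sum().backward()
stable_grad = x2.grad.clone()

print("naive gradient: ", naive_grad)
print("stable gradient:", stable_grad)
```

The identity `1 - sigmoid(x) = sigmoid(-x)` is what lets the second form avoid computing the log of a number that has already been rounded to zero.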

Any suggestions on how I can correct my loss function? Maybe by inheriting from the built-in loss and modifying its forward function?
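One possible way to keep the weighting idea without hand-writing the logs is to let `F.binary_cross_entropy_with_logits` compute the per-element loss with `reduction="none"` and then apply the `4 * (target - 0.5) ** 2` weight. This is only a sketch: the class name `SoftIgnoreBCEWithLogits` is hypothetical, and `pos_weight` is assumed to be either `None` or a tensor, as the built-in function expects.

```python
import torch
import torch.nn.functional as F


class SoftIgnoreBCEWithLogits(torch.nn.Module):
    """Hypothetical loss: BCE-with-logits where target == 0.5 gets zero weight."""

    def __init__(self, pos_weight=None):
        super().__init__()
        self.pos_weight = pos_weight  # None or a tensor, as in the built-in loss

    def forward(self, logits, target):
        # Numerically stable per-element loss, no manual log/sigmoid
        per_elem = F.binary_cross_entropy_with_logits(
            logits, target, reduction="none", pos_weight=self.pos_weight
        )
        # 1 for target in {0, 1}, 0 for target == 0.5
        weight = 4 * (target - 0.5) ** 2
        # Mean over all elements, including the zero-weighted ones
        return (per_elem * weight).mean()


criterion = SoftIgnoreBCEWithLogits()
logits = torch.tensor([100.0, -100.0, 30.0, 0.0], requires_grad=True)
target = torch.tensor([0.0, 1.0, 0.5, 0.5])
loss = criterion(logits, target)
loss.backward()
print(loss, logits.grad)
```

Note that the mean here still divides by the total number of elements, matching the original formulation; dividing by `weight.sum()` instead would average over only the 0/1 targets.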


If the goal is for these samples to always count as correctly classified, why do you use them in the model at all?
Why not filter them out before the forward pass, based on the target? That way you won’t do useless computation in the forward/backward passes, and you will be able to use the vanilla loss function.

Hi @albanD, thank you for replying.
I didn’t mention it, but I’m using a Temporal Convolutional Network because I’m looking for temporal causality between samples, so I shouldn’t remove any of my inputs.

In the meantime, I seem to have solved my problem in a manner similar to what you proposed:
I remove those ‘always true’ samples after computing input=model(x), based on their target locations, and then I use the built-in BCEWithLogitsLoss. I guess the built-in implementation is more robust.
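A minimal sketch of that approach, assuming the logits and targets have the same shape: the full sequence still goes through the model, and only the loss ignores the positions whose target is 0.5. The helper name `masked_bce_with_logits` and its `ignore_value` parameter are hypothetical, not part of PyTorch.

```python
import torch
import torch.nn as nn


def masked_bce_with_logits(logits, target, ignore_value=0.5, pos_weight=None):
    """Hypothetical helper: BCEWithLogitsLoss over positions where target != ignore_value."""
    keep = target != ignore_value  # boolean mask, same shape as target
    if not keep.any():
        # Nothing to score: return a zero that is still attached to the graph
        return logits.sum() * 0.0
    criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)
    # Boolean indexing flattens the kept positions into 1-D tensors
    return criterion(logits[keep], target[keep])


# Stand-in for input = model(x); the model still sees every time step
logits = torch.tensor([2.0, -1.0, 50.0, 0.3], requires_grad=True)
target = torch.tensor([1.0, 0.0, 0.5, 1.0])
loss = masked_bce_with_logits(logits, target)
loss.backward()
print(loss, logits.grad)  # the gradient at the masked position is zero
```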
