Adding a constant to the loss function

Hello! I need to write a slightly modified percentage-error loss function with a threshold on the denominator. The relevant part of the code looks like this:

threshold = 290000

def exp_rmspe(pred, target):
    loss = torch.abs(target - pred)/np.maximum(target,threshold)
    return loss.mean()

I have a batch size of 128. When I use this, I get the following error:

RuntimeError: bool value of Variable objects containing non-empty torch.cuda.ByteTensor is ambiguous

How can I add that threshold constant in the loss function, without getting that error? Thank you!


Is this error coming from this function? Also, to avoid any issues, I would advise against mixing numpy and torch functions; you can use torch.max() instead of np.maximum().
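For reference, a minimal pure-torch rewrite of the loss above might look like this (a sketch, assuming `pred` and `target` are torch tensors; `torch.clamp` with `min=` plays the same role as `np.maximum` against a scalar threshold and keeps everything inside the autograd graph):

```python
import torch

threshold = 290000

def exp_rmspe(pred, target):
    # clamp the denominator from below instead of calling np.maximum(),
    # so no numpy op ever touches a (possibly CUDA) torch tensor
    loss = torch.abs(target - pred) / target.clamp(min=threshold)
    return loss.mean()
```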

Hi, sorry for reviving this thread after a year.

I have a similar question about adding a constant in the loss function: let’s suppose I want to multiply the final loss by a factor, and that factor is computed based on the inputs or the outputs within a batch.

In my case I’m working with triplet networks and a custom loss, and now I would like to add a scale factor based on the number of ‘easy negatives’, for example. This is part of my loss function:

def forward(self, x, labels):
    # L2 distances between anchor-positive and anchor-negative pairs
    l2_norm_positive = F.pairwise_distance(x[0], x[1], 2)
    l2_norm_negative = F.pairwise_distance(x[0], x[2], 2)
    # mask of triplets whose negative is already beyond the margin
    easy_negatives = l2_norm_negative >= (l2_norm_positive + self.margin)
    easy_negatives_percent = torch.sum(easy_negatives).float() / labels[0].shape[0]
    scale_factor = 1 / (1 - easy_negatives_percent + self.eps)

If I remove the part where it computes the scale factor I get normal gradients after calling .backward(), but if I leave that section the gradients become nan.

How could I do that without breaking the computation graph? How should I deal with that constant (scale_factor)? Should I use .detach()?

Thanks in advance

If you are getting nan, it most likely means that you’re dividing by 0 somewhere in your function. Can you check the different values to confirm that?

You can use .detach(), but that will have a different effect: no gradient will flow back through the scale_factor. It depends on what you want to do here.
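As an illustration only (the function name, the `margin`/`eps` values, and the base triplet loss below are assumptions, not the poster’s actual module), a detached scale factor could be wired in like this:

```python
import torch
import torch.nn.functional as F

def scaled_triplet_loss(anchor, positive, negative, margin=1.0, eps=1e-6):
    l2_norm_positive = F.pairwise_distance(anchor, positive, 2)
    l2_norm_negative = F.pairwise_distance(anchor, negative, 2)

    # standard differentiable triplet loss as the base term
    base_loss = F.relu(l2_norm_positive - l2_norm_negative + margin).mean()

    # fraction of easy negatives; the comparison is non-differentiable anyway,
    # and .detach() makes it explicit that the scale is treated as a constant,
    # so no gradient flows back through scale_factor
    easy = (l2_norm_negative >= l2_norm_positive + margin).float().mean()
    scale_factor = (1.0 / (1.0 - easy + eps)).detach()

    return scale_factor * base_loss
```

Note that 1/(1 - easy + eps) still blows up to roughly 1/eps when every negative in the batch is easy, so the nan/instability check above applies here too.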