How to pass weights to BCELoss()

Hello,

I have code like this:

loss_fn = reg_BCELoss(dim=2)

loss = 0
def train(...loss_fn):
    ...
    for i_batch, (samples, labels) in enumerate(TrainDL):
        ...
        loss_batch = loss_fn(labels_pred, labels)
        loss += loss_batch.item()
        ...

Now I’d like to set weights inside my training loop, but apparently only the constructor can set them. That means I’d have to create the loss function object inside the loop, i.e. create thousands of objects.

So, what do I do? Am I trying to do something stupid?

The weights in BCELoss can typically be determined before you start your training loop, because they usually depend on the label distribution of the whole training data, not of each minibatch.
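
For example, roughly like this (a sketch only; TrainDL and the binary label layout are assumptions about your setup):

import torch

# sketch: compute the label statistics once, before the training loop
# (assumes TrainDL yields (samples, labels) batches with binary labels)
all_labels = torch.cat([labels.flatten().float() for _, labels in TrainDL])
pos_frac = all_labels.mean()                 # fraction of positive labels in the whole training set
neg_over_pos = (1.0 - pos_frac) / pos_frac   # fixed statistic you could later turn into per-element weights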

A quote from the BCELoss docs:

  • weight (Tensor, optional) – a manual rescaling weight given to the loss of each batch element. If given, has to be a Tensor of size nbatch.

Do you want to use this feature, or do you want a weight for each class?

How do I write LaTeX in this forum?

BCELoss() takes two vectors, x and y, each of size N. We then have:

l(x, y) = L = {l_1, …, l_N}, where

l_n = -w_n * [y_n * log(x_n) + (1 - y_n) * log(1 - x_n)],

which is linear in w_n. Once we have L, we apply a reduction. This reduction can be the mean or the sum.

This is the reason people apply BCELoss() batch-wise. The reduction is what condenses the information from the whole set into a single statement.
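
To make the notation concrete, here is a small check (made-up tensors) that the element-wise formula matches BCELoss(reduction='none'):

import torch
import torch.nn as nn

x = torch.rand(5).clamp(1e-4, 1 - 1e-4)   # predictions in (0, 1)
y = torch.randint(0, 2, (5,)).float()     # binary targets

manual = -(y * torch.log(x) + (1 - y) * torch.log(1 - x))  # l_n with w_n = 1
library = nn.BCELoss(reduction='none')(x, y)
print(torch.allclose(manual, library))    # True; the mean/sum reduction is applied afterwards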

In short: I can’t see your argument, and I can’t see why there is no weight argument to the forward function. I mean, what I could do is simple:

  1. Call BCELoss() with reduction='none' and implement my own reduction.

But still, I must be missing something, because if my argument were correct, I could provide w_n to forward().
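
For reference, point 1 would look roughly like this (the per-element weights w are only placeholders):

import torch
import torch.nn as nn

loss_fn = nn.BCELoss(reduction='none')      # keep the element-wise losses l_n

x = torch.rand(8).clamp(1e-4, 1 - 1e-4)     # predictions
y = torch.randint(0, 2, (8,)).float()       # targets
w = torch.rand(8)                           # per-element weights w_n (placeholder)

per_element = loss_fn(x, y)                 # l_n, no reduction yet
loss = (w * per_element).mean()             # my own weighted reduction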

What do you mean by class? In any case: I don’t want to pass the weights to the constructor but to the forward function; see the argument in my other post. The reason for that is simple: I don’t want to go through gigabytes and gigabytes of data and pass several hundred megabytes of information to one argument. It just seems wrong. :slight_smile:

You can change loss_fn.weight.
You can also use
F.binary_cross_entropy(input, target, weight=weight, reduction=reduction),
where F is torch.nn.functional.
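
For example (how the weight tensor is built here is only illustrative):

import torch
import torch.nn.functional as F

labels_pred = torch.rand(4).clamp(1e-4, 1 - 1e-4)  # model outputs after a sigmoid
labels = torch.randint(0, 2, (4,)).float()

# build whatever per-element weight you want for this batch (illustrative):
weight = torch.where(labels == 1, torch.tensor(2.0), torch.tensor(1.0))

loss = F.binary_cross_entropy(labels_pred, labels, weight=weight, reduction='mean')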

It feels dirty to me to alter a value that was set by the constructor. Is that really how Python people design such a thing?

You can calculate the loss, multiply it by the scaling factor before doing the reduction over the batch, and then do the reduction. Here’s something implemented:

This is for sequences though; you probably have BxC and not BxTxC, but I think the idea is similar.
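
For a plain BxC case, a rough sketch of the same multiply-before-reduction idea (the scaling values are made up):

import torch
import torch.nn as nn

B, C = 4, 3
preds = torch.rand(B, C).clamp(1e-4, 1 - 1e-4)   # probabilities, shape BxC
targets = torch.randint(0, 2, (B, C)).float()
scale = torch.tensor([1.0, 2.0, 0.5])            # per-class scaling factors (made up)

raw = nn.BCELoss(reduction='none')(preds, targets)  # element-wise losses, shape BxC
loss = (raw * scale).mean()                         # scale first, then reduce over the batch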

I know this is old news now, but I’m running into this myself and feel your pain. I wonder how anyone uses weights with BCELoss successfully. Needing to know the weights a priori for each element of each batch, just to pass them to the constructor, is such a weird constraint to have…

I’m using it like this between each batch on a project:

criterion = nn.BCEWithLogitsLoss()  # note: pos_weight is only honored by BCEWithLogitsLoss, not BCELoss
...
# after getting labels
if targets.sum(0) == 0 or (1 - targets).sum(0) == 0:
    continue  # skip batches that are all-negative or all-positive

criterion.pos_weight = (1 - targets).sum(0) / targets.sum(0)

...

The if statement is there to avoid a divide by zero and any potential anomalies from the loss function internals.