How to pass weights to BCELoss()

Hello,

I have code like this:

loss_fn = reg_BCELoss(dim=2)

loss = 0
def train(...loss_fn):
    ...
    for i_batch, (samples, labels) in enumerate(TrainDL):
        ...
        loss_batch = loss_fn(labels_pred, labels)
        loss += loss_batch.item()
        ...

Now I’d like to set weights inside my training loop, but apparently only the constructor can set them. That means I’d have to create the loss function object inside the loop, i.e. create thousands of objects.

So, what do I do? Am I trying to do something stupid?

The weights in BCELoss can typically be determined before you start your training loop, because they usually depend on the label distribution of the whole training data, not of each minibatch.
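
For example, roughly like this (a sketch only; TrainDL and the binary label layout are assumptions about your setup):

import torch

# sketch: compute the label statistics once, before the training loop
# (assumes TrainDL yields (samples, labels) batches with binary labels)
all_labels = torch.cat([labels.flatten().float() for _, labels in TrainDL])
pos_frac = all_labels.mean()                 # fraction of positive labels in the whole training set
neg_over_pos = (1.0 - pos_frac) / pos_frac   # fixed statistic you could later turn into per-element weights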

A quote from the BCELoss docs:

  • weight (Tensor, optional) – a manual rescaling weight given to the loss of each batch element. If given, has to be a Tensor of size nbatch.

Do you want to use this feature, or do you want a weight for each class?

How do I write LaTeX in this forum?

BCELoss() takes two vectors, x and y, each of size N. We then have:

l(x, y) = L = {l_1, …, l_N}, where

l_n = -w_n * [y_n * log(x_n) + (1 - y_n) * log(1 - x_n)],

which is linear in w_n. Once we have L, we apply a reduction. This reduction can be the mean or the sum.

This is the reason people apply BCELoss() batch-wise. The reduction is what condenses the information from the whole set into a single statement.
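
To make the notation concrete, here is a small check (made-up tensors) that the element-wise formula matches BCELoss(reduction='none'):

import torch
import torch.nn as nn

x = torch.rand(5).clamp(1e-4, 1 - 1e-4)   # predictions in (0, 1)
y = torch.randint(0, 2, (5,)).float()     # binary targets

manual = -(y * torch.log(x) + (1 - y) * torch.log(1 - x))  # l_n with w_n = 1
library = nn.BCELoss(reduction='none')(x, y)
print(torch.allclose(manual, library))    # True; the mean/sum reduction is applied afterwards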

In short: I can’t see your argument, and I can’t see why there is no weight argument to the forward function. I mean, what I could do is simple:

  1. Call BCELoss() with reduction='none' and implement my own reduction.

But still, I must be missing something, because if my argument were correct, I could provide w_n to forward().
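
For reference, point 1 would look roughly like this (the per-element weights w are only placeholders):

import torch
import torch.nn as nn

loss_fn = nn.BCELoss(reduction='none')      # keep the element-wise losses l_n

x = torch.rand(8).clamp(1e-4, 1 - 1e-4)     # predictions
y = torch.randint(0, 2, (8,)).float()       # targets
w = torch.rand(8)                           # per-element weights w_n (placeholder)

per_element = loss_fn(x, y)                 # l_n, no reduction yet
loss = (w * per_element).mean()             # my own weighted reduction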

What do you mean by class? In any case: I don’t want to pass the weights to the constructor but to the forward function; see the argument in my other post. The reason for that is simple: I don’t want to go through gigabytes and gigabytes of data and pass several hundred megabytes of information to one argument. It just seems wrong. :slight_smile:

You can change loss_fn.weight.
You can also use
F.binary_cross_entropy(input, target, weight=weight, reduction=reduction),
where F is torch.nn.functional.
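
For example (how the weight tensor is built here is only illustrative):

import torch
import torch.nn.functional as F

labels_pred = torch.rand(4).clamp(1e-4, 1 - 1e-4)  # model outputs after a sigmoid
labels = torch.randint(0, 2, (4,)).float()

# build whatever per-element weight you want for this batch (illustrative):
weight = torch.where(labels == 1, torch.tensor(2.0), torch.tensor(1.0))

loss = F.binary_cross_entropy(labels_pred, labels, weight=weight, reduction='mean')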

It feels dirty to me to alter a value that was set by the constructor. Is that really how Python people design such a thing?

You can calculate the loss, multiply it by the scaling factor before doing the reduction over the batch, and then do the reduction. Here’s something implemented:

This is for sequences though; you probably have BxC and not BxTxC, but I think the idea is similar.
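
For a plain BxC case, a rough sketch of the same multiply-before-reduction idea (the scaling values are made up):

import torch
import torch.nn as nn

B, C = 4, 3
preds = torch.rand(B, C).clamp(1e-4, 1 - 1e-4)   # probabilities, shape BxC
targets = torch.randint(0, 2, (B, C)).float()
scale = torch.tensor([1.0, 2.0, 0.5])            # per-class scaling factors (made up)

raw = nn.BCELoss(reduction='none')(preds, targets)  # element-wise losses, shape BxC
loss = (raw * scale).mean()                         # scale first, then reduce over the batch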

I know this is old news now, but I’m running into this myself and feel your pain. I wonder how anyone uses weights with BCELoss successfully. Needing to know the weights a priori for each element of each batch, just to pass them to the constructor, is such a weird constraint to have…

I’m using it like this between each batch on a project:

criterion = nn.BCEWithLogitsLoss()  # note: pos_weight is only honored by BCEWithLogitsLoss, not BCELoss
...
# after getting labels
if targets.sum(0) == 0 or (1 - targets).sum(0) == 0:
    continue  # skip batches that are all-negative or all-positive

criterion.pos_weight = (1 - targets).sum(0) / targets.sum(0)

...

The if statement is there to avoid a divide by zero and any potential anomalies from the loss function internals.