# Calculating weighted BCEWithLogitsLoss

I am slightly confused about using a weighted BCEWithLogitsLoss. My input data shape is 1 x 52 x 52 x 52 (a 3D volume), the label for each volume is either 0 or 1, and I am using a batch size of 5. So, for each iteration, the input is 5 x 1 x 52 x 52 x 52 and the label is 1 x 5. The way I am calculating weights is:

```python
weight_0 = count_of_lbl1 / total_lbl_count
weight_1 = count_of_lbl0 / total_lbl_count
```

My question is: should I calculate weights per batch or per dataset? Also, assuming my labels for a batch are [0, 0, 1, 0, 1], weight_0 is 0.4, and weight_1 is 0.6, will the weight tensor passed to `nn.BCEWithLogitsLoss` be [0.4, 0.4, 0.6, 0.4, 0.6]?

Hello Anil!

I prefer calculating the weight based on the entire dataset,
rather than on a per-batch basis, although it shouldn’t matter
if your batches are of reasonable size. The point is that the
weights aren’t magic numbers that have to be just right. They
are approximate numbers that are used to (partially) account
for having a significantly unbalanced dataset.
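For example, a one-time, dataset-wide computation might look like the following sketch (the label values here are made up for illustration):

```python
import torch

# Hypothetical full-dataset label list (0 = negative, 1 = positive).
all_labels = torch.tensor([0., 0., 1., 0., 1., 0., 0., 1., 0., 0.])

count_pos = all_labels.sum()
count_neg = all_labels.numel() - count_pos

# pos_weight = (# negatives) / (# positives) -- here 7 / 3.
pos_weight = (count_neg / count_pos).unsqueeze(0)

# One criterion, built once, reused for every batch.
criterion = torch.nn.BCEWithLogitsLoss(pos_weight=pos_weight)
```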

Calculating the weights per batch would be reasonable, but the
disadvantage is that you would have to construct a new instance
of `BCEWithLogitsLoss` for every batch, because your `weight`
tensor depends on the batch.

I am assuming here that by “weight tensor” you mean the tensor
you pass into `BCEWithLogitsLoss`'s constructor as its named
`weight` argument, e.g.:

```python
criterion = torch.nn.BCEWithLogitsLoss(weight=my_weight_tensor)
loss = criterion(preds, targets)
```
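And to answer your second question concretely: yes, the per-element weight tensor lines up with the labels. A small sketch, using the numbers from your post:

```python
import torch

# Labels and class weights from the question (0.4 for class 0, 0.6 for class 1).
labels = torch.tensor([0., 0., 1., 0., 1.])
weight_0, weight_1 = 0.4, 0.6

# Pick each element's weight according to its label.
my_weight_tensor = torch.where(labels == 1,
                               torch.tensor(weight_1),
                               torch.tensor(weight_0))
# my_weight_tensor is [0.4, 0.4, 0.6, 0.4, 0.6]
```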

It is more convenient to use the named `pos_weight` constructor
argument:

```python
criterion = torch.nn.BCEWithLogitsLoss(
    pos_weight=torch.FloatTensor([count_of_lbl0 / count_of_lbl1]))
```
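As a quick sanity check (with toy numbers I've made up), `pos_weight` multiplies only the positive-sample term of the loss:

```python
import torch
import torch.nn.functional as F

logit = torch.tensor([0.0])
target = torch.tensor([1.0])  # a positive sample

plain = F.binary_cross_entropy_with_logits(logit, target)
weighted = F.binary_cross_entropy_with_logits(
    logit, target, pos_weight=torch.tensor([2.0]))

# For a positive target, pos_weight = 2 exactly doubles the loss term.
assert torch.isclose(weighted, 2 * plain)
```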

You can now use the same `criterion` loss object, constructed
once, over and over again for each batch.

I would also note that if your relative weights are, in fact, `0.4`
and `0.6`, your dataset isn’t really very unbalanced, and I probably
wouldn’t bother using weights in the loss function.

As an aside, your shapes look a little confused. I assume that
“5 x 1 x 52 x 52 x 52” is the shape of the input to your model
(not your loss function). The shape of your label is probably `[5]`
(although it could be `[5, 1]`), but a shape of `[1, 5]` would be
wrong.
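A minimal shape check (with random stand-in values) illustrates the point:

```python
import torch

# For a batch of 5 volumes, both logits and labels should have shape [5]
# (or both [5, 1]); BCEWithLogitsLoss expects matching shapes.
logits = torch.randn(5)                       # one raw score per volume
targets = torch.tensor([0., 0., 1., 0., 1.])  # shape [5]

criterion = torch.nn.BCEWithLogitsLoss()
loss = criterion(logits, targets)  # scalar loss; shapes match
```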

Best.

K. Frank


I see what you mean (it took me a while to figure out). So rather than passing the `weight` argument (which is the rescaling weight of each batch element) to the loss function, we pass the `pos_weight` argument (which is the weight of positive examples). In what scenarios would you want to use one or the other? Also, if I am calculating `pos_weight` for each batch, don’t I still have to instantiate the loss function for each batch to pass the `pos_weight` argument? Will something like this work:

```python
criterion = torch.nn.BCEWithLogitsLoss()
# Assuming I calculate 'pos_weight' for each batch
weights = torch.FloatTensor([count_of_lbl0 / count_of_lbl1])
criterion.pos_weight = weights
loss = criterion(output, label)
```

Hi Anil!

That is correct.

If your sample weight only depends on the class of the sample,
you can use either. I tend to think that `pos_weight` is a little more
convenient.

If your sample weight depends on something other than the sample’s
class – for example, if you’re “hard mining” and want to weight “hard”
samples more heavily – then you would have to use `weight` and
provide per-sample weights.
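A hypothetical hard-mining sketch (the scheme and numbers here are invented for illustration, not from your setup) might look like:

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([2.0, -1.5, 0.1, -0.2, 3.0])
targets = torch.tensor([1., 0., 1., 0., 1.])

# Unreduced losses reveal which samples are currently "hard".
per_sample = F.binary_cross_entropy_with_logits(
    logits, targets, reduction='none')

# Invented scheme: double the weight of the single hardest sample.
weights = torch.ones_like(per_sample)
weights[per_sample.argmax()] = 2.0

criterion = torch.nn.BCEWithLogitsLoss(weight=weights)
loss = criterion(logits, targets)
```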

Also, if you are using the different, but related loss class, `BCELoss`
(which you generally shouldn’t be using), you will have to use `weight`
because, for whatever reason, `BCELoss` doesn’t have a `pos_weight`
argument for its constructor.

Yes (although, as you note below, it does appear to be possible to
modify the `pos_weight` property after `BCEWithLogitsLoss` has
been constructed). But, again, my preference is to use the same
`pos_weight` for the whole dataset, rather than calculate it for each
batch.

I don’t see this discussed in the documentation, but it does appear to
work for me (using pytorch 0.3.0, and `weight` rather than `pos_weight`).

I would probably avoid doing it this way because I’m not sure that
it’s officially supported.

Best.

K. Frank


Sorry for being late.

In my case, 90% of my dataset is the negative class and 10% is the positive class (in both the train and validation sets; it is a binary classification problem).

How would I use `BCEWithLogitsLoss` or `BCELoss` with their parameters `weight` and/or `pos_weight` to handle my imbalanced datasets?

Best regards