# Calculating weighted BCEWithLogitsLoss

I am slightly confused about using a weighted BCEWithLogitsLoss. My input data shape is 1 x 52 x 52 x 52 (a 3D volume), the label for each volume is either 0 or 1, and I am using a batch size of 5. So, at each iteration, the input is 5 x 1 x 52 x 52 x 52 and the label is 1 x 5. The way I am calculating the weights is:

```python
weight_0 = count_of_lbl1 / total_lbl_count
weight_1 = count_of_lbl0 / total_lbl_count
```

My question is: should I calculate the weights per batch or per dataset? Also, assuming my label for a batch is [0, 0, 1, 0, 1], weight_0 is 0.4, and weight_1 is 0.6, would the weight tensor passed to nn.BCEWithLogitsLoss be [0.4, 0.4, 0.6, 0.4, 0.6]?
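For concreteness, here is a small sketch of how I would build that per-element weight tensor (the dataset counts of 60 negatives and 40 positives are made up, chosen so they produce the 0.4/0.6 weights above):

```python
import torch

# assumed dataset label counts (hypothetical numbers)
count_of_lbl0, count_of_lbl1 = 60, 40
total_lbl_count = count_of_lbl0 + count_of_lbl1

weight_0 = count_of_lbl1 / total_lbl_count  # 0.4, applied to label-0 samples
weight_1 = count_of_lbl0 / total_lbl_count  # 0.6, applied to label-1 samples

labels = torch.tensor([0., 0., 1., 0., 1.])
# per-element weights: weight_1 where label == 1, weight_0 elsewhere
weights = torch.where(labels == 1,
                      torch.tensor(weight_1),
                      torch.tensor(weight_0))
print(weights)  # tensor([0.4000, 0.4000, 0.6000, 0.4000, 0.6000])
```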

Hello Anil!

I prefer calculating the weight based on the entire dataset,
rather than on a per-batch basis, although it shouldn't matter
if your batches are of reasonable size. The point is that the
weights aren't magic numbers that have to be just right. They
are approximate numbers that are used to (partially) account
for having a significantly unbalanced dataset.

This would be reasonable, but the disadvantage is that you would
have to construct a new instance of `BCEWithLogitsLoss` for
every batch, because your `weight` tensor depends on the batch.

I am assuming here that by "weight tensor" you mean the tensor
you pass into `BCEWithLogitsLoss`'s constructor as its named
`weight` argument, e.g.:

```python
criterion = torch.nn.BCEWithLogitsLoss(weight=my_weight_tensor)
loss = criterion(preds, targets)
```

It is more convenient to use the named `pos_weight` constructor
argument:

```python
criterion = torch.nn.BCEWithLogitsLoss(pos_weight=torch.FloatTensor([count_of_lbl0 / count_of_lbl1]))
```

You can now use the same `criterion` loss object, constructed
once, over and over again for each batch.
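For example (a sketch, using assumed whole-dataset counts of 60
negatives and 40 positives):

```python
import torch

# assumed whole-dataset label counts (hypothetical numbers)
count_of_lbl0, count_of_lbl1 = 60, 40

# pos_weight multiplies the loss contribution of positive samples;
# a value > 1 upweights the (rarer) positive class
pos_weight = torch.tensor([count_of_lbl0 / count_of_lbl1])  # 1.5
criterion = torch.nn.BCEWithLogitsLoss(pos_weight=pos_weight)

# the same criterion object is reused for every batch
for _ in range(3):
    preds = torch.randn(5)                        # raw logits from the model
    targets = torch.randint(0, 2, (5,)).float()   # labels in {0, 1}
    loss = criterion(preds, targets)
```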

I would also note that if your relative weights are, in fact, `0.4`
and `0.6`, your dataset isn't really very unbalanced, and I probably
wouldn't bother using weights in the loss function.

As an aside, your shapes look a little confused. I assume that
"5 x 1 x 52 x 52 x 52" is the shape of the input to your model
(not your loss function). The shape of your label is probably `[5]`
(although it could be `[5, 1]`), but a shape of `[1, 5]` would be
wrong.
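A quick sketch of the shape requirement: `BCEWithLogitsLoss`
wants the target to have the same shape as the input, so a `[5]`
target matches `[5]` logits, while a `[1, 5]` target does not:

```python
import torch

criterion = torch.nn.BCEWithLogitsLoss()
preds = torch.randn(5)                         # logits, shape [5]
targets = torch.tensor([0., 0., 1., 0., 1.])   # labels, shape [5]

loss = criterion(preds, targets)               # fine: shapes agree

try:
    criterion(preds, targets.reshape(1, 5))    # target shape [1, 5]
except ValueError as e:
    print("shape mismatch:", e)
```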

Best.

K. Frank


I see what you mean (it took me a while to figure out). So rather than passing the `weight` argument (which is the rescaling weight of each batch element) to the loss function, we pass the `pos_weight` argument (which is the weight of the positive examples). In what scenarios would you want to use one or the other? Also, if I am calculating the `pos_weight` argument for each batch, don't I still have to instantiate the loss function for each batch to pass the `pos_weight` argument? Will something like this work:

```python
criterion = torch.nn.BCEWithLogitsLoss()
# Assuming I calculate 'pos_weight' for each batch
weights = torch.FloatTensor([count_of_lbl0 / count_of_lbl1])
criterion.pos_weight = weights
loss = criterion(output, label)
```

Hi Anil!

That is correct.

If your sample weight only depends on the class of the sample,
you can use either. I tend to think that `pos_weight` is a little more
convenient.

If your sample weight depends on something other than the sample's
class -- for example, if you're "hard mining" and want to weight "hard"
samples more heavily -- then you would have to use `weight` and
provide per-sample weights.
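For example (a purely hypothetical hard-mining heuristic, just to
illustrate per-sample `weight` -- the specific weighting rule here
is an assumption, not something prescribed by pytorch):

```python
import torch

preds = torch.randn(5)                        # logits for one batch
targets = torch.tensor([0., 0., 1., 0., 1.])

# hypothetical "hardness": how far the predicted probability is
# from the target; harder samples get larger weights (in [1, 2])
with torch.no_grad():
    probs = torch.sigmoid(preds)
    hardness = (probs - targets).abs()
    sample_weights = 1.0 + hardness

# weight depends on the batch, so the loss object is rebuilt per batch
criterion = torch.nn.BCEWithLogitsLoss(weight=sample_weights)
loss = criterion(preds, targets)
```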

Also, if you are using the different, but related loss class, `BCELoss`
(which you generally shouldn't be using), you will have to use `weight`
because, for whatever reason, `BCELoss` doesn't have a `pos_weight`
argument for its constructor.

Yes (although, as you note below, it does appear to be possible to
modify the `pos_weight` property after `BCEWithLogitsLoss` has
been constructed). But, again, my preference is to use the same
`pos_weight` for the whole dataset, rather than calculate it for each
batch.

I don't see this discussed in the documentation, but it does appear to
work for me (using pytorch 0.3.0, and `weight` rather than `pos_weight`).

I would probably avoid doing it this way because I'm not sure that
it's officially supported.

Best.

K. Frank


Sorry for being late.

In my case, 90% of my dataset is the negative class and 10% is the positive class (in both the train and validation sets; it is a binary classification problem).

How would I use `BCEWithLogitsLoss` or `BCELoss` with their parameters `weight` and/or `pos_weight` to handle my imbalanced datasets?

Best regards
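Applying the `pos_weight` approach from earlier in the thread, a 90/10 split would give `pos_weight = n_negative / n_positive = 9` (a sketch under that assumption, not a definitive recipe):

```python
import torch

# 90% negative / 10% positive -> pos_weight = 0.90 / 0.10 = 9
pos_weight = torch.tensor([0.90 / 0.10])
criterion = torch.nn.BCEWithLogitsLoss(pos_weight=pos_weight)

preds = torch.randn(10)          # logits from the model
targets = torch.zeros(10)        # mostly negative batch ...
targets[0] = 1.0                 # ... with one positive in ten
loss = criterion(preds, targets)
```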