Pos_weight attribute in BCEWithLogitLoss

chinmay5 · May 31, 2021, 1:33pm

I am attempting to work on the following loss function. To my understanding, this is essentially a BCE loss function where we need to work with the weight parameter. I initially started with

Counting the number of positive examples and then weight = total_samples / number_positive_per_class. However, clearly this is not what the paper suggests.
I read through the documents and it seems I need to set the pos_weight parameter. However, there is nothing to set the neg_weights as might be needed in this case. So, can I use the ratio of pos_weight = neg_samples/ pos_samples
If this should be the way, can I use the direct ratio or I need to do something else in order to ensure the loss values are correctly weighted?

Any help would be highly appreciated. Thank you

ptrblck · June 1, 2021, 6:32am

I think you could transform the posted formula into the pos_weight formula used in nn.BCEWithLogitsLoss as given in the docs.

As described, pos_weight is specified as nb_neg/nb_pos and is multiplied with the “positive” part of the loss (left summand). If you divide both summands by lambda_0 in your formula, you would end up with the same loss formulation.

chinmay5 · June 1, 2021, 7:58am

Thank you so much for the reply. Here is a followup though. I see there is also an implementation for the function MultiLabelSoftMarginLoss. I think in spirit it is similar to the BCEWithLogitLoss and since I have a multi label classification, I thought it would be better to use the MultiLabelSoftMarginLoss function. However, there is no pos_weight parameter in it. Is there a way I can use it or should I stick to BCEWithLogitLoss.

Also, just to be sure, should I use pos_weight = neg_samples_of_each_class/ pos_samples_of_each_class in the loss formulation?

ptrblck · June 1, 2021, 8:04am

I don’t know how MultiLabelSoftMarginLoss would compare to BCEWithLogitsLoss, so let’s wait for some experts to chime in.

In case you are dealing with a multi-label classification, pos_weight should get a value for each class, i.e. it should be a tensor containing nb_classes values defined as [nb_neg_class0/nb_pos_class0, nb_neg_class1/nb_pos_class1, ...].

chinmay5 · June 1, 2021, 8:10am

Thank you for the quick response. Just to confirm, even if I use only the BCEWithLogitLoss, it should be fine in the Multi-label scenario. I just want to make sure that the loss I am computing in this way, is not incorrect.

ptrblck · June 1, 2021, 8:13am

Yes, nn.BCEWithLogitsLoss can be used for binary and multi-label classification use cases.