How to give positive instances bigger weight?

NadiaMe · August 16, 2018, 5:34pm

In my training dataset I have many more negative instances (0 label) than positives (1 label).
So it seems my model is heavily biased towards predicting low probabilities. How do I rescale the weights?
As output I have an array of probabilities, e.g. [0.1, 0.3, 0.8], and an array of labels, e.g. [0, 1, 0].

I would like to try different loss functions, such as MSE or BCE, and maybe some others.
Which loss function would be most appropriate here for giving more weight to positive instances?
And how would I do it? I see that some loss functions have a weight argument, but I am not sure how to set it properly in my case.

Or, for example, BCEWithLogitsLoss has a pos_weight argument. The docs say “Must be a vector with length equal to the number of classes.”
So if my inputs to the loss function are [0.1, 0.3, 0.8] and [0, 1, 0], what should my pos_weight be? Something like [0.1, 0.9]?
Thanks

velikodniy · August 20, 2018, 11:51pm

The list [0, 1, 0] is not a list of classes, just the target values for a single class. If you have only one class, pos_weight should contain only one value. For example, if you have 300 positive samples and 200 negative ones, you should set pos_weight to 200 / 300. It will virtually “reduce” the set of positive samples from 300 to 200.
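A minimal sketch of that setup, not a definitive recipe. It assumes the model outputs raw logits rather than probabilities, since BCEWithLogitsLoss applies the sigmoid internally. The class counts are the ones from the example above, and the logit values are made up for illustration.

```python
import torch
import torch.nn as nn

# Class counts from the example above (illustrative only).
num_pos, num_neg = 300, 200

# One class, so pos_weight holds a single value: negatives / positives.
pos_weight = torch.tensor([num_neg / num_pos])
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

# Made-up raw logits (NOT probabilities) and float targets.
logits = torch.tensor([-2.0, -0.8, 1.4])
targets = torch.tensor([0.0, 1.0, 0.0])

loss = criterion(logits, targets)
print(loss.item())
```

With more negatives than positives the same ratio comes out greater than 1, which increases the contribution of the positive samples.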

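An alternative is to sample from the negative subset with higher frequency (in that example the negatives are the minority; in your case, with fewer positives, you would oversample the positives instead).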
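One way to sketch the sampling alternative is with WeightedRandomSampler, weighting each sample by the inverse frequency of its class. The dataset here is random and hypothetical (4 features, 300 positives, 200 negatives), so treat it as an illustration under those assumptions.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Made-up imbalanced dataset: 300 positives, 200 negatives.
labels = torch.cat([torch.ones(300), torch.zeros(200)])
features = torch.randn(len(labels), 4)  # hypothetical 4-dim features
dataset = TensorDataset(features, labels)

# Weight each sample by the inverse frequency of its class,
# so that both classes are drawn about equally often.
class_counts = torch.bincount(labels.long())  # counts for class 0 and class 1
sample_weights = 1.0 / class_counts[labels.long()].float()

# Draw with replacement so minority samples can repeat within an epoch.
sampler = WeightedRandomSampler(sample_weights, num_samples=len(labels), replacement=True)
loader = DataLoader(dataset, batch_size=50, sampler=sampler)
```

Each batch from `loader` is then roughly balanced on average, and the loss can stay unweighted.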

NadiaMe · August 21, 2018, 12:56am

Hmm, in the latest version of PyTorch I got errors saying pos_weight is not present. So I used the weight argument instead, where negative instances have a weight of 1 and positive instances have a weight of negs/pos. I think it should be correct…
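A rough sketch of that workaround, using the probabilities and labels from the first post. The class counts are hypothetical, and the functional form `F.binary_cross_entropy` is used here (an assumption about convenience, not a requirement) so the weights can be rebuilt for every batch.

```python
import torch
import torch.nn.functional as F

# The probabilities and labels from the question.
probs = torch.tensor([0.1, 0.3, 0.8])
targets = torch.tensor([0.0, 1.0, 0.0])

# Hypothetical class counts: far more negatives than positives.
num_pos, num_neg = 100, 900
w_pos = num_neg / num_pos  # weight applied to each positive sample

# Per-element weights: 1 for negatives, w_pos for positives.
weights = 1.0 + (w_pos - 1.0) * targets

# BCE on probabilities, with each element's loss rescaled by its weight.
loss = F.binary_cross_entropy(probs, targets, weight=weights)
print(loss.item())
```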

velikodniy · August 21, 2018, 7:59am

pos_weight first appeared only in 0.4.1. Try to update your PyTorch installation.

Note that weight is a way to say “these samples are more important”, whereas pos_weight is a way to say “errors on positive samples (false negatives) must cost more”. These approaches are different and are not equivalent in the general case.

NadiaMe · August 29, 2018, 7:58pm

Could you please explain how they differ mathematically? The difference is not clear to me…

velikodniy · August 29, 2018, 9:04pm

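In your case, with only one class, there is no difference. With hard 0/1 labels, multiplying the positive term of the loss by pos_weight gives the same result as giving every positive sample a weight of that value and every negative sample a weight of 1.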
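This can be checked numerically. The sketch below uses made-up logits, hard 0/1 labels, and a hypothetical positive weight `p`, and computes the loss both ways with `F.binary_cross_entropy_with_logits`.

```python
import torch
import torch.nn.functional as F

# Made-up logits and hard 0/1 labels for a single class.
logits = torch.tensor([-2.0, -0.8, 1.4, 0.3])
targets = torch.tensor([0.0, 1.0, 0.0, 1.0])
p = 3.0  # hypothetical positive weight

# Option 1: pos_weight scales only the positive (y = 1) term of the loss.
loss_pos_weight = F.binary_cross_entropy_with_logits(
    logits, targets, pos_weight=torch.tensor([p])
)

# Option 2: weight scales the whole loss of each element;
# positives get p and negatives get 1.
per_sample = 1.0 + (p - 1.0) * targets
loss_weight = F.binary_cross_entropy_with_logits(logits, targets, weight=per_sample)

print(torch.allclose(loss_pos_weight, loss_weight))
```

The two stop agreeing once the targets are soft (between 0 and 1), or when weight is fixed per class instead of derived from each label, because weight then rescales both the positive and the negative term.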