In my training dataset I have many more negative instances (0 label) than positives (1 label).
So it seems my model is heavily biased towards predicting low probabilities. How do I rescale the weights?
As output I have an array of probabilities, e.g. [0.1, 0.3, 0.8], and an array of labels [0, 1, 0].
I would like to try different loss functions, such as MSE or BCE, and maybe some others.
What would be the most appropriate loss function here that allows giving more weight to positive instances?
And how would I do it? I see that some loss functions have a weight argument, but I am not sure how to set it properly in my case.
Or, for example, BCEWithLogitsLoss has a pos_weight argument. It says “Must be a vector with length equal to the number of classes.”
So if my input to a loss function is [0.1, 0.3, 0.8], [0, 1, 0], what should my pos_weight be? Something like [0.1, 0.9]?
The list [0, 1, 0] is not a list of classes but a list of target values for a single class. If you have only one class,
pos_weight should contain only one element. For example, if you have 300 positive samples and 200 negative ones, you should set
pos_weight to 200 / 300. It will virtually “reduce” the set of positive samples from 300 to 200.
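A minimal sketch of that setting, assuming PyTorch 0.4.1 or later and a model that outputs raw logits. BCEWithLogitsLoss applies the sigmoid itself, so probabilities such as [0.1, 0.3, 0.8] should not be passed to it directly. The counts and tensors below are made-up illustration values:

```python
import torch
import torch.nn as nn

# Hypothetical class counts, taken from the example above.
num_pos = 300
num_neg = 200

# One output unit, so pos_weight has exactly one element: negatives / positives.
pos_weight = torch.tensor([num_neg / num_pos])

criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

# Made-up batch: raw logits (before the sigmoid) and float 0/1 targets.
logits = torch.tensor([-2.0, -0.8, 1.4])
targets = torch.tensor([0.0, 1.0, 0.0])

loss = criterion(logits, targets)
print(loss.item())
```

With many more negatives than positives, as in the original question, the same ratio comes out greater than 1 and scales up the loss on positive samples.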
An alternative is to sample from the negative subset with higher frequency.
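One way to do this kind of resampling in PyTorch is torch.utils.data.WeightedRandomSampler, which draws each sample with probability proportional to a per-sample weight. A rough sketch, using the 300/200 split from the example above; the labels tensor is made up for illustration:

```python
import torch
from torch.utils.data import WeightedRandomSampler

# Made-up labels: 300 positives followed by 200 negatives.
labels = torch.cat([torch.ones(300), torch.zeros(200)]).long()

# Weight every sample by the inverse frequency of its class, so the rarer
# class is drawn more often.
class_counts = torch.bincount(labels, minlength=2).double()  # [num_neg, num_pos]
sample_weights = (1.0 / class_counts)[labels]

generator = torch.Generator().manual_seed(0)  # fixed seed, for reproducibility only
sampler = WeightedRandomSampler(
    weights=sample_weights,
    num_samples=10_000,
    replacement=True,
    generator=generator,
)

# Look at which labels the sampler would draw.
drawn = labels[torch.tensor(list(sampler))]
print(drawn.float().mean().item())  # fraction of positives among drawn samples
```

In training, the sampler would be passed to a DataLoader through its sampler argument (which cannot be combined with shuffle=True), so each batch is roughly balanced.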
Hmm, in the last version of PyTorch I got errors saying that pos_weight is not present. So I used the weight argument, where zero instances have a weight of 1 and positives have a weight of pos/negs. I think it should be correct…
pos_weight was only added in PyTorch 0.4.1. Try updating your PyTorch installation.
weight is a way to say “negative samples are more important”, whereas
pos_weight is a way to say “negative errors must be larger”. These approaches are different and are not equivalent in the general case.
Could you please explain how they differ mathematically? The difference is not clear to me…
In your case with only one class there is no difference.
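To see this concretely, the per-sample loss that the PyTorch documentation gives for BCEWithLogitsLoss is -w * [p * y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x))], where w is the per-sample weight and p is pos_weight. A small pure-Python sketch (the helper name weighted_bce and the numbers are made up for illustration) suggests that, with a single output and 0/1 targets, giving positives a sample weight of p and negatives a weight of 1 produces the same value as setting pos_weight to p:

```python
import math

def weighted_bce(logit, target, weight=1.0, pos_weight=1.0):
    """Per-sample BCE-with-logits loss, written out by hand (hypothetical helper)."""
    prob = 1.0 / (1.0 + math.exp(-logit))
    return -weight * (pos_weight * target * math.log(prob)
                      + (1.0 - target) * math.log(1.0 - prob))

logits = [-2.0, -0.8, 1.4]   # made-up raw model outputs
targets = [0.0, 1.0, 0.0]
p = 200 / 300                # ratio from the example above

# Option A: scale the positive term through pos_weight.
loss_a = sum(weighted_bce(x, y, pos_weight=p) for x, y in zip(logits, targets))

# Option B: per-sample weight of p for positives and 1 for negatives.
loss_b = sum(weighted_bce(x, y, weight=p if y == 1.0 else 1.0)
             for x, y in zip(logits, targets))

print(math.isclose(loss_a, loss_b))
```

The two stop matching once targets are soft (not exactly 0 or 1) or there are several outputs per sample: weight rescales the whole per-element loss, while pos_weight only rescales the positive term.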