Hi Amir -
The short answer is that your dataset is “unbalanced,” so you
should try using the pos_weight
argument when you construct
your BCEWithLogitsLoss
loss criterion.
Looking at your target, and naively assuming that all positive
class labels appear about equally frequently, it appears that any
given class will be labelled positive only about once in every
nine images.
So your classifier could do a pretty good job by just predicting
negative for all of your classes all of the time.
The purpose of the pos_weight
argument is to address this
issue by weighting the less-frequent positive samples more heavily
than the more-frequent negative samples. Doing so penalizes
a model that simply predicts negative all the time.
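As a concrete illustration (the numbers here are made up, not taken
from your data), compare the loss on a single positive sample with
and without pos_weight:

```python
import torch

logit = torch.tensor([0.0])   # raw score; sigmoid(0.0) = 0.5
target = torch.tensor([1.0])  # positive label

plain = torch.nn.BCEWithLogitsLoss()
weighted = torch.nn.BCEWithLogitsLoss(pos_weight=torch.tensor([9.0]))

# pos_weight multiplies the positive term of the loss, so with
# pos_weight = 9, mispredicting a positive costs nine times as much.
print(plain(logit, target))     # -log(0.5) ≈ 0.6931
print(weighted(logit, target))  # 9 * -log(0.5) ≈ 6.2383
```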
It’s quite likely that some classes have positive labels more often
than others. pos_weight
takes a vector of per-class weights so
that you can reweight each class separately. A common (and
reasonable) choice for the class weights is:
weights[i] = total_samples / number_of_positive_class_i_samples[i]
Best.
K. Frank