Hi Amir -
The short answer is that your dataset is “unbalanced,” so you
should try using the pos_weight
argument when you construct
your BCEWithLogitsLoss
loss criterion.
Looking at your target, and naively assuming that all positive
class labels appear about equally frequently, it appears that any
given class will be labelled positive only about once in every
nine images.
So your classifier could do a pretty good job by just predicting
negative for all of your classes all of the time.
The purpose of the pos_weight
argument is to address this
issue by weighting the less-frequent positive samples more heavily
than the more-frequent negative samples. Doing so penalizes
a model that simply predicts negative all the time.
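As a concrete illustration (the numbers here are made up, not taken
from your data), compare the loss on a single positive sample with
and without pos_weight:

```python
import torch

logit = torch.tensor([0.0])   # raw score; sigmoid(0.0) = 0.5
target = torch.tensor([1.0])  # positive label

plain = torch.nn.BCEWithLogitsLoss()
weighted = torch.nn.BCEWithLogitsLoss(pos_weight=torch.tensor([9.0]))

# pos_weight multiplies the positive term of the loss, so with
# pos_weight = 9, mispredicting a positive costs nine times as much.
print(plain(logit, target))     # -log(0.5) ≈ 0.6931
print(weighted(logit, target))  # 9 * -log(0.5) ≈ 6.2383
```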
It’s quite likely that some classes have positive labels more often
than others. pos_weight
takes a vector of per-class weights so
that you can reweight each class separately. A common (and
reasonable) choice for the class weights is:
weights[i] = total_samples / number_of_positive_class_i_samples[i]
Best.
K. Frank