Multi-label classification: model learns to predict only negative classes

I am trying to build a multi-label classification model on the Pascal VOC 2007 dataset, which has 20 classes. My model's accuracy seemed good at first, but I later found out that it had learned to predict values close to 0 for all classes. Binary cross entropy (with logits) rewards predicting the absence of classes too, and since any given image contains far fewer present classes than absent ones, I suspect the loss decreases when the model predicts that nothing is in the picture.
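To illustrate the suspicion above, here is a minimal sketch (the target vector with 2 of 20 classes present is made up for illustration, assuming PyTorch) comparing an uninformed model against one that has collapsed to predicting "absent" everywhere:

```python
import torch
import torch.nn.functional as F

# Hypothetical single-image target: 2 of 20 classes present.
targets = torch.zeros(20)
targets[3] = 1.0
targets[7] = 1.0

# An "uninformed" model: logit 0 -> probability 0.5 for every class.
uninformed = torch.zeros(20)
# A collapsed model: strongly negative logit for every class.
all_absent = torch.full((20,), -4.0)

loss_uninformed = F.binary_cross_entropy_with_logits(uninformed, targets)
loss_absent = F.binary_cross_entropy_with_logits(all_absent, targets)

# With 18 negatives vs 2 positives, the collapsed model gets the lower loss
print(loss_uninformed.item(), loss_absent.item())  # ~0.693 vs ~0.418

# ...and 90% "accuracy" at a 0.5 threshold, despite predicting nothing.
preds = (torch.sigmoid(all_absent) > 0.5).float()
accuracy = (preds == targets).float().mean()
print(accuracy.item())  # 0.9
```

So the mean BCE really is lower for the all-absent predictor whenever positives are rare, which matches the behaviour described.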

I tried giving more weight to the present classes, but it didn't work well. Everybody seems to be happy with plain binary cross entropy loss, so is it just me having this problem? And if so, can someone explain how BCE loss overcomes it for everyone else, because to me it seems logical for the model to collapse like this. Also, how can I fix it?

I think giving more weight to positive classes can work in general, but there might be reasons it doesn't work in your case. If some classes are very rare, you might look into oversampling them. In the context of detection, people have used focal loss to get around this.
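Both ideas can be sketched in a few lines, assuming PyTorch. The class counts below are made up for illustration; `pos_weight` in `BCEWithLogitsLoss` is one common way to up-weight positives (a frequent heuristic is negatives/positives per class), and the focal loss here follows the Lin et al. formulation:

```python
import torch
import torch.nn as nn

# Hypothetical per-class positive counts in a 2000-image training set,
# 3 classes for brevity. pos_weight multiplies the positive term per class.
pos_counts = torch.tensor([120.0, 80.0, 300.0])
neg_counts = 2000.0 - pos_counts
criterion = nn.BCEWithLogitsLoss(pos_weight=neg_counts / pos_counts)

logits = torch.randn(4, 3)
targets = torch.randint(0, 2, (4, 3)).float()
weighted_loss = criterion(logits, targets)

# Focal loss down-weights easy examples -- including the many easy
# negatives -- so the rare positives dominate the gradient.
def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    p = torch.sigmoid(logits)
    ce = nn.functional.binary_cross_entropy_with_logits(
        logits, targets, reduction="none")
    p_t = p * targets + (1 - p) * (1 - targets)  # prob of the true label
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()

fl = focal_loss(logits, targets)
```

With `gamma=0` and `alpha=0.5` the focal loss reduces to (half of) ordinary BCE, which is a handy sanity check when tuning it.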

Best regards


I am not using transfer learning, and there are 2k images in the training set across 20 classes. Could it also result from not having enough data? Maybe with more data the model would see more present classes, etc.

Well, if your training method required more data than your dataset has and everything else worked, you would see the network reach very good accuracy on the training set and poor accuracy on the validation set.

Yes, but I suspect that in this case I wouldn't see that, because the negative-class predictions are inflating the accuracy anyway.
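That inflation is easy to make visible by reporting recall alongside accuracy. A minimal sketch, with a made-up batch of 4 images and a model that has collapsed to predicting "absent" everywhere:

```python
import torch

# Hypothetical batch: 4 images x 20 classes, one positive label per image.
targets = torch.zeros(4, 20)
targets[0, 3] = targets[1, 7] = targets[2, 3] = targets[3, 11] = 1.0

# Collapsed model: predicts "absent" for every class of every image.
preds = torch.zeros(4, 20)

accuracy = (preds == targets).float().mean()

tp = (preds * targets).sum()
recall = tp / targets.sum().clamp(min=1)

print(accuracy.item())  # 0.95 -- looks great
print(recall.item())    # 0.0  -- the model detects nothing
```

Per-class precision/recall or F1 would expose the collapse immediately, whereas element-wise accuracy is dominated by the easy negatives.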