Would it be possible to give more punishment to false negative mistakes?

I'm doing a multi-label classification problem with 15 classes. The net I'm using is a ResNet50 with the last layer's output set to 11. The loss function is BCEWithLogitsLoss.

I have around 12,000 labeled pictures. However, the label matrix is sparse; for any given picture, maybe only 1 or 2 classes will be active. There are also a few pictures labeled all zeros.

At first, when I trained, the net tended to give all-zero results, which seems to be a local minimum.

Then I dropped those all-zero samples and used the pos_weight parameter of BCEWithLogitsLoss to balance the different classes. But the net still tends to give the same result (all-zero predictions).

I wonder if there's any method to punish false negatives more heavily, which may help the net predict more positive results.

Thanks a lot!

Hi Hefan!

This seems odd. For a 15-class classifier, I would expect your last
layer to produce 15 values (not 11).

In this case I would expect the output of your model (the input to
BCEWithLogitsLoss) to have shape [nBatch, nClass = 15], and
your labels (the target) to have the same shape. The output of
your model should be logits, that is, the output of the last Linear
layer of your network, not passed through any non-linear activation
function. Your target values will typically be 0 or 1 (although they
could be probabilities that range from 0.0 to 1.0).
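As a minimal sketch of those shapes (using a plain Linear layer as a hypothetical stand-in for the final layer of a ResNet50, whose backbone features are 2048-dimensional):

```python
import torch
import torch.nn as nn

n_batch, n_class = 4, 15  # 15 classes, per the problem description

# hypothetical stand-in for the last Linear layer of a ResNet50
head = nn.Linear(2048, n_class)
features = torch.randn(n_batch, 2048)

logits = head(features)  # raw logits -- no sigmoid applied here
target = torch.randint(0, 2, (n_batch, n_class)).float()  # 0/1 labels

# BCEWithLogitsLoss applies sigmoid internally, so it takes logits directly
loss = nn.BCEWithLogitsLoss()(logits, target)
```

Both logits and target have shape [4, 15] here, and loss is a scalar.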

Training with pos_weight (when greater than 1.0) should, indeed,
cause your trained network to produce more positive predictions. If
you make pos_weight large enough I would expect your network to
be able to make positive predictions (but, of course, possibly false
positive predictions).
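One common recipe is to set a per-class pos_weight to the ratio of negatives to positives for that class, so each false negative for class c is weighted pos_weight[c] times more than before. A sketch with a made-up sparse label matrix (the counts here are illustrative, not your data):

```python
import torch
import torch.nn as nn

# hypothetical label matrix: 12000 samples, 15 classes, ~10% positives
labels = (torch.rand(12000, 15) < 0.1).float()

n_pos = labels.sum(dim=0)  # number of positives per class
# negatives / positives per class; clamp guards against empty classes
pos_weight = (labels.shape[0] - n_pos) / n_pos.clamp(min=1)

# false negatives for class c now cost pos_weight[c] times as much
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

logits = torch.randn(8, 15)
target = (torch.rand(8, 15) < 0.1).float()
loss = criterion(logits, target)
```

pos_weight has shape [n_class], and with sparse labels its entries come out well above 1, pushing the network toward positive predictions.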

Can you successfully overfit your network by training it on a small
subset of your training set so that it produces (nearly) perfect results
on that subset, and therefore produces positive predictions?
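That sanity check might look like the following (a toy stand-in model and a tiny fixed "subset", not your actual ResNet50 or data, just to show the overfitting loop):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# toy stand-in model and a small fixed "subset" of 16 samples
model = nn.Linear(32, 15)
x = torch.randn(16, 32)
y = (torch.rand(16, 15) < 0.15).float()

criterion = nn.BCEWithLogitsLoss()
opt = torch.optim.Adam(model.parameters(), lr=0.05)

# train on the same tiny subset until the model memorizes it
for _ in range(500):
    opt.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    opt.step()

preds = (torch.sigmoid(model(x)) > 0.5).float()
accuracy = (preds == y).float().mean()  # should approach 1.0
```

If the per-element accuracy on the memorized subset gets near 1.0, the network is at least capable of producing positive predictions, and the problem lies elsewhere (labels, loss setup, or optimization).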


K. Frank