Segmentation Dice Loss for Background Class Images (no TP, FP, FN) Possible?

I’ve been training U-Net models on a regular dataset, and on one augmented with images that contain only the background class.

For those background-only images, by definition there can be no TP. The ideal situation is that the model predicts TP = 0, FP = 0, FN = 0, and TN = every pixel. If this ideal situation is achieved, the dice loss goes to zero.

BUT, if there’s even a single FP, the dice loss jumps to its maximum value. So the model gets no gradient signal for learning to classify every pixel as background. When I train on a mix of images with foreground pixels and images with only background pixels, the loss gets dominated by this effect. Is there a workaround for this?
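The effect can be seen numerically with a standard soft-Dice with a smoothing term (the `dice_loss` name and the smoothing value are my own, just for illustration):

```python
import torch

def dice_loss(pred, target, smooth=1.0):
    # pred, target: flattened foreground probabilities / labels
    inter = (pred * target).sum()
    union = pred.sum() + target.sum()
    return 1.0 - (2.0 * inter + smooth) / (union + smooth)

target = torch.zeros(100)           # background-only image: no foreground pixels

perfect = torch.zeros(100)          # model predicts all background
one_fp = torch.zeros(100)
one_fp[0] = 1.0                     # a single false-positive pixel

print(dice_loss(perfect, target))   # 0: smooth/smooth = 1, so loss is zero
print(dice_loss(one_fp, target))    # 0.5: one wrong pixel out of 100 already
                                    # jumps the loss halfway to its maximum
```

Without the smoothing term, the same single FP would drive the loss all the way to 1, since the intersection stays zero for any prediction on a background-only image.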

Wouldn’t an all background input predicted as such yield TN = 1?
If the model predicts another class and the dice loss shoots up (you might need to clamp it in case it goes to Inf), the model will get a strong signal to predict background only.

Do you mean that the very high dice loss biases the model to only predict all background images?
If so, you could try to clip the loss or just remove these images.

Is there a way to assign weights to the dice loss? For example, I want foreground predictions to have a higher impact, since foreground is rarer in the dataset. I’m having issues where the model simply predicts background for every image after some training time.

If your model only predicts the background class, the intersection part of the dice loss calculation would be zero and thus your dice loss should be high.
I don’t know if there is a clear way to add weighting to it, but you could try to combine the dice loss with e.g. nn.CrossEntropyLoss (and add weights there).
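A combined loss might look something like this sketch for a binary (background/foreground) setup; the `combined_loss` name, the `alpha` mixing factor, and the weight values are my own assumptions:

```python
import torch
import torch.nn as nn

# class weights for CE: make the rare foreground class count more
# (the values here are purely illustrative)
weights = torch.tensor([0.2, 0.8])            # [background, foreground]
ce = nn.CrossEntropyLoss(weight=weights)

def dice_loss(probs, target_onehot, smooth=1.0):
    inter = (probs * target_onehot).sum()
    union = probs.sum() + target_onehot.sum()
    return 1.0 - (2.0 * inter + smooth) / (union + smooth)

def combined_loss(logits, target, alpha=0.5):
    # logits: [N, 2, H, W], target: [N, H, W] with values {0, 1}
    probs = torch.softmax(logits, dim=1)
    onehot = nn.functional.one_hot(target, num_classes=2)
    onehot = onehot.permute(0, 3, 1, 2).float()
    return alpha * ce(logits, target) + (1 - alpha) * dice_loss(probs, onehot)
```

The weighted CE term then provides a per-pixel gradient even when the Dice term saturates on background-only images.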

  1. You can give a higher weight to classes by pretending the pixels where they are the ground truth occur multiple times (i.e. TP and FN are multiplied, and the total number of pixels is adjusted accordingly).
  2. So if you have very few “interesting” pixels in a large number of uninteresting ones, you essentially have a heavily imbalanced problem. In my experience, balancing the training data might work even better than applying weights to the losses. When training the U-Net in our book in chapter 13, we take care that we have enough slices with nodules fed into the U-Net. A very brief discussion is in section 13.5.5 (Designing our training and validation data).
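One way to implement the data-balancing idea in PyTorch is a `WeightedRandomSampler` that oversamples slices containing foreground; the flags and the 3:1 weighting below are toy values of my own:

```python
import torch
from torch.utils.data import WeightedRandomSampler

# has_foreground[i] = True if sample i contains any non-background pixel
# (toy flags; in practice derive them from your masks)
has_foreground = torch.tensor([True, False, False, False, True, False])

# give foreground-containing samples a higher draw probability
sample_weights = torch.where(has_foreground,
                             torch.tensor(3.0),   # oversample the rare slices
                             torch.tensor(1.0))
sampler = WeightedRandomSampler(sample_weights,
                                num_samples=len(sample_weights),
                                replacement=True)
# then: DataLoader(dataset, sampler=sampler, batch_size=...)
```

This changes what the model sees per epoch instead of rescaling the loss, which is the approach described in the book's chapter 13.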

Best regards



Suppose you have C foreground classes as well as background. If you model the background as a separate class, you will have C+1 classes, and for images that contain only background the background channel gives TP = every pixel, so the dice score stays well defined.
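A multi-class Dice along these lines might be sketched as follows (the function name is my own; channel 0 is assumed to be the background class):

```python
import torch

def multiclass_dice_loss(probs, target, smooth=1.0):
    # probs: [N, C+1, H, W] softmax outputs, channel 0 = background
    # target: [N, H, W] integer labels in [0, C]
    num_classes = probs.shape[1]
    onehot = torch.nn.functional.one_hot(target, num_classes)
    onehot = onehot.permute(0, 3, 1, 2).float()
    dims = (0, 2, 3)                  # reduce over batch and spatial dims
    inter = (probs * onehot).sum(dims)
    union = probs.sum(dims) + onehot.sum(dims)
    # per-class dice, averaged over all C+1 classes
    return 1.0 - ((2.0 * inter + smooth) / (union + smooth)).mean()

# background-only image, predicted perfectly: loss is zero,
# because the background channel contributes a full intersection
probs = torch.zeros(1, 2, 4, 4)
probs[:, 0] = 1.0
target = torch.zeros(1, 4, 4, dtype=torch.long)
print(multiclass_dice_loss(probs, target))  # 0
```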