I’ve been training U-Net models with a regular dataset, and with one augmented with images that contain only the background class.
For those situations with background-class images, by definition there can be no TP. The ideal situation is that the model predicts TP = 0, FP = 0, TN = 0, and FN = every pixel. If this ideal situation is achieved, the dice loss goes to zero.
BUT, if there’s even a single FP or FN, the dice loss jumps to its maximum value. So the model has no way to learn to classify every pixel as background. When I train on a mix of images with non-background pixels and images with only background pixels, the loss gets dominated by this effect. Is there a workaround for this?
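The effect described above can be reproduced with a minimal sketch of a soft dice loss with an `eps` smoothing term (the function and tensor shapes are illustrative, not the poster's exact implementation):

```python
import torch

def dice_loss(pred, target, eps=1e-6):
    # soft dice loss for a single foreground channel
    inter = (pred * target).sum()
    union = pred.sum() + target.sum()
    return 1 - (2 * inter + eps) / (union + eps)

# background-only image: the target contains no foreground pixels
target = torch.zeros(256, 256)

# perfect all-background prediction: the eps term keeps the loss near 0
perfect = dice_loss(torch.zeros(256, 256), target)

# a single false-positive pixel flips the loss to ~1
pred = torch.zeros(256, 256)
pred[0, 0] = 1.0
one_fp = dice_loss(pred, target)
```

So on background-only images the loss is nearly binary: ~0 for a perfect prediction, ~1 for any mistake, with no useful gradient in between.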
Wouldn’t an all-background input predicted as such yield TN = 1?
If the model predicts another class and the dice loss shoots up (you might need to clamp it in case it goes to Inf), the model will get a strong signal to predict background only.
Do you mean that the very high dice loss biases the model toward predicting all-background output?
If so, you could try to clip the loss or just remove these images.
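Clipping could look like this minimal sketch, where the per-sample loss values are made up for illustration: the dice loss is computed per image, clamped, and only then averaged over the batch.

```python
import torch

# hypothetical per-image dice losses; the last image is background-only
per_sample_loss = torch.tensor([0.12, 0.08, 0.99])

# clip each sample's loss so degenerate images can't dominate the batch mean
clipped = per_sample_loss.clamp(max=0.5)
loss = clipped.mean()
```

The cap value (0.5 here) is a hyperparameter you would need to tune.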
Is there a way to assign weights to the dice loss? For example, I want foreground predictions to have a higher impact since foreground is rarer in the dataset. I’m having issues where the model simply predicts background for every image after some training time.
If your model only predicts the background class, the intersection part of the dice loss calculation would be zero and thus your dice loss should be high.
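As a quick numeric check of that claim (using a minimal soft dice loss with an `eps` smoothing term, not the poster's exact code): an all-background prediction on an image that does contain foreground yields zero intersection, so the loss sits near its maximum.

```python
import torch

def dice_loss(pred, target, eps=1e-6):
    # soft dice loss for a single foreground channel
    inter = (pred * target).sum()
    union = pred.sum() + target.sum()
    return 1 - (2 * inter + eps) / (union + eps)

# image with some foreground, but the model predicts background everywhere
target = torch.zeros(64, 64)
target[20:40, 20:40] = 1.0      # a 20x20 block of foreground pixels
pred = torch.zeros(64, 64)      # all-background prediction

loss = dice_loss(pred, target)  # intersection is 0, so the loss is near 1
```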
I don’t know if there is a clean way to add weighting to it, but you could try to combine the dice loss with e.g. nn.CrossEntropyLoss (and add weights there).
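A sketch of that combination for a binary (background/foreground) setup — the weight values and the dice formulation are assumptions to be tuned, not a canonical recipe:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# upweight the rarer foreground class; the values here are placeholders
class_weights = torch.tensor([0.2, 0.8])  # [background, foreground]
ce = nn.CrossEntropyLoss(weight=class_weights)

def foreground_dice_loss(logits, target, eps=1e-6):
    # soft dice on the foreground probability channel only
    fg_prob = F.softmax(logits, dim=1)[:, 1]
    fg_true = (target == 1).float()
    inter = (fg_prob * fg_true).sum()
    union = fg_prob.sum() + fg_true.sum()
    return 1 - (2 * inter + eps) / (union + eps)

logits = torch.randn(4, 2, 64, 64)          # [batch, 2 classes, H, W]
target = torch.randint(0, 2, (4, 64, 64))   # integer class labels

loss = foreground_dice_loss(logits, target) + ce(logits, target)
```

You could also weight the two terms relative to each other, e.g. `alpha * dice + (1 - alpha) * ce`, and treat `alpha` as another hyperparameter.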
Suppose you have C foreground classes plus background: you can model the background as a separate class, giving C+1 classes in total, and then TP = every pixel for images that contain only background.
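A sketch of that idea, assuming a multiclass soft dice averaged over all C+1 channels (background as class 0; the eps smoothing term rescues the empty foreground channels when the model confidently predicts background):

```python
import torch
import torch.nn.functional as F

def multiclass_dice_loss(logits, target, eps=1e-6):
    # soft dice averaged over all channels, background included as class 0
    num_classes = logits.shape[1]
    probs = F.softmax(logits, dim=1)
    one_hot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).float()
    dims = (0, 2, 3)
    inter = (probs * one_hot).sum(dims)
    union = probs.sum(dims) + one_hot.sum(dims)
    return 1 - ((2 * inter + eps) / (union + eps)).mean()

# background-only image: with background as an explicit class, a correct
# prediction now gives a full intersection on the background channel
target = torch.zeros(1, 32, 32, dtype=torch.long)
logits = torch.zeros(1, 3, 32, 32)   # C = 2 foreground classes + background
logits[:, 0] = 50.0                  # confidently predict background everywhere

loss = multiclass_dice_loss(logits, target)  # stays near zero
```

Note this only stays near zero when the predicted foreground probability mass on the image is below `eps`; a less confident prediction still gets penalized on the empty channels.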