New PyTorch user here. I am trying to implement binary semantic segmentation using a U-Net.
To get myself started I took some code from this example.
My dataset contains some 250 images and I used rotations to expand that to nearly 1000.
The model achieves high accuracy very quickly, after just a couple of epochs.
All prediction masks are blank. The more epochs the blanker they are.
After a fair amount of reading, I believe the problem is that the targets in my images are fairly small.
The model therefore gets a high accuracy by predicting zeroes for every pixel, essentially training itself to paint everything black.
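To make that concrete, here is a minimal sketch in plain Python of why pixel accuracy is misleading here. The 100x100 mask size and the 2% foreground share are made-up numbers for illustration, not taken from my dataset, and the helper names are hypothetical. It compares accuracy with the Dice coefficient for an all-black prediction.

```python
def pixel_accuracy(pred, target):
    """Fraction of pixels where the prediction equals the target."""
    correct = sum(p == t for p, t in zip(pred, target))
    return correct / len(target)


def dice_score(pred, target, eps=1e-7):
    """Dice coefficient for the positive class of flattened binary masks."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return (2 * intersection + eps) / (sum(pred) + sum(target) + eps)


# Hypothetical 100x100 mask, flattened, with a 2% foreground target.
num_pixels = 100 * 100
num_positive = 200
target = [1] * num_positive + [0] * (num_pixels - num_positive)

# A "blank" prediction: every pixel is predicted as background.
blank_pred = [0] * num_pixels

acc = pixel_accuracy(blank_pred, target)
dice = dice_score(blank_pred, target)
print(f"accuracy={acc:.3f} dice={dice:.6f}")
```

Under these assumptions the blank mask is right on 98% of the pixels while its Dice score is essentially zero, so an overlap metric such as Dice or IoU is probably a better thing to monitor than accuracy.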
The best options appear to be:
Many more training epochs.
A weighted scoring approach where positive pixels are favoured.
Any thoughts on this analysis, and hints on how to achieve the weighted scoring, would be appreciated. I’m still very new to all this.
The blank predictions can indeed be caused by a highly imbalanced target distribution in the mask images, and you could try a weighted loss function to counter this effect.
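One way to do the weighting is the `pos_weight` argument of `nn.BCEWithLogitsLoss`, which scales the loss on positive pixels. The snippet below is only a sketch under some assumptions: the model returns raw logits (no final sigmoid), the masks are float tensors of zeros and ones, and the tensors here are random stand-ins rather than your data. The negative-to-positive pixel ratio is a common starting point for the weight, not a guaranteed optimum.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical batch: 4 masks of 1x64x64 with a small square target in each.
masks = torch.zeros(4, 1, 64, 64)
masks[:, :, 28:36, 28:36] = 1.0  # 64 positive pixels out of 4096 per mask

# Stand-in for the raw U-Net output (logits, no sigmoid applied).
logits = torch.randn(4, 1, 64, 64)

# Ratio of negative to positive pixels; ideally computed over the training set.
num_pos = masks.sum()
num_neg = masks.numel() - num_pos
pos_weight = (num_neg / num_pos).reshape(1)  # shape [1] broadcasts over all pixels

# Weighted loss: errors on positive pixels count pos_weight times as much.
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)
loss = criterion(logits, masks)

# Unweighted loss on the same tensors, for comparison only.
unweighted = nn.BCEWithLogitsLoss()(logits, masks)
print(f"pos_weight={pos_weight.item():.1f} "
      f"weighted={loss.item():.3f} unweighted={unweighted.item():.3f}")
```

In a training loop you would compute `pos_weight` once from the training masks, move it to the same device as the model, and reuse the same `criterion` for every batch. If the full ratio makes the model over-predict the foreground, a smaller weight may work better.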
I’m unsure about the last effect of flipped predictions and then going to a blank prediction again.