For defect segmentation I have trained a distinct segmentation network per defect class, i.e. 20 classes -> 20 models, where each model outputs a sigmoid-activated segmentation map of shape Bx1xHxW.
Now I want to fuse all predictions in a single segmentation map of shape Bx20xHxW.
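For reference, the fusion step I have in mind is roughly the following sketch (the `models` list and function name are just placeholders for my 20 trained networks):

```python
import torch

def fuse_predictions(models, images):
    # Each per-class model already outputs a sigmoid-activated map of
    # shape Bx1xHxW, so fusion is just a channel-wise concatenation.
    with torch.no_grad():
        maps = [m(images) for m in models]
    return torch.cat(maps, dim=1)  # BxCxHxW, here C = 20
```

The problem is that an argmax (or any per-pixel class decision) over the channel dimension of this fused map is meaningless when every channel is saturated near 1.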
The main issue is that the network predictions are almost always close to 1 (0.97-0.999), even for false positives, which makes a pixel-wise class decision impossible.
I already tried using negative examples from the other classes during training to lower the confidence on false positives, but this didn't change much.
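Concretely, my negative-example scheme looks roughly like this sketch (the `is_negative` flag and function name are illustrative; it assumes the model's sigmoid output is fed into a plain BCE loss):

```python
import torch
import torch.nn.functional as F

def bce_with_negatives(preds, masks, is_negative):
    # preds: Bx1xHxW sigmoid outputs of the single-class model.
    # masks: Bx1xHxW ground-truth defect masks for that class.
    # is_negative: B-dim bool tensor marking samples that show a
    # *different* defect class; their target is forced to an all-zero
    # mask so the model should learn to predict 0 there.
    targets = masks.clone()
    targets[is_negative] = 0.0
    return F.binary_cross_entropy(preds, targets)
```

Even with a substantial fraction of such negatives per batch, the sigmoid outputs stay saturated near 1 on false positives.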
Any ideas how I can improve my training procedure so that the network outputs are more evenly distributed between 0 and 1? The single-network-per-class setup is unfortunately a requirement.