Adding classification to segmentation model approach

Hello all,

If you have a model that segments a structure, like kidneys in this case:


What is a good approach for eliminating images that have a few positive pixels but no kidneys? A simple approach is thresholding the number of pixels but I figured I would get better results with a classifier on top of the segmentation model.

In this case, I was thinking of training the segmentation models on images with just kidneys and then training the classifier on the real-world distribution of images (with and without kidneys), building it on that trained segmentation model. My question is whether I should use only the encoder of the segmentation model, or take the final output of the segmentation model and add a few dense layers on top.

Thoughts or feedback appreciated!

My model setup for reference:

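import segmentation_models_pytorch as smp

ENCODER = 'efficientnet-b6'
ENCODER_WEIGHTS = 'imagenet'
ACTIVATION = None  # not shown in the original post; None (raw logits) is assumed here
DEVICE = 'cuda'
n_class = 2

model = smp.Unet(
    encoder_name=ENCODER,
    encoder_weights=ENCODER_WEIGHTS,
    classes=n_class,
    activation=ACTIVATION,
)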

Hello Akshay!

My intuition tells me that if your model does a good job of segmenting
the kidney images, then thresholding the count of predicted kidney
pixels should do an essentially perfect job of classifying kidney vs.
non-kidney images.

But, with the following qualification:

If you train your segmenter on non-kidney images (no kidney pixels)
as well as kidney images, and it segments non-kidney images
successfully, predicting no, or just a few, kidney pixels, thresholding
should work very well.
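
To make the thresholding idea concrete, here is a minimal sketch. It assumes the segmenter returns raw logits of shape (N, C, H, W) and that class 1 is the kidney class. The function name `has_kidney` and the cutoff `min_pixels=500` are made up for illustration; the cutoff in particular would need to be tuned on a validation set.

```python
import torch

def has_kidney(logits: torch.Tensor, kidney_class: int = 1,
               min_pixels: int = 500) -> torch.Tensor:
    """Classify each image by counting its predicted kidney pixels.

    logits: raw segmenter output of shape (N, C, H, W).
    Returns a boolean tensor of shape (N,).
    """
    pred = logits.argmax(dim=1)                      # (N, H, W) per-pixel class
    counts = (pred == kidney_class).sum(dim=(1, 2))  # kidney pixels per image
    return counts >= min_pixels                      # hypothetical cutoff


# Toy logits: image 0 has a 30x30 kidney blob, image 1 only a 3x3 speck.
logits = torch.zeros(2, 2, 64, 64)
logits[:, 0] = 0.5             # background wins by default
logits[0, 1, :30, :30] = 1.0   # 900 kidney pixels
logits[1, 1, :3, :3] = 1.0     # 9 kidney pixels
result = has_kidney(logits)
```

With these toy inputs the first image lands above the cutoff and the second below it, which is the "few positive pixels but no kidneys" case from the question.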

But if you only train your segmenter on kidney images, I could see
the segmenter becoming strongly biased to predict that an image
has many kidney pixels (because all of your training images do),
and therefore predicting a lot of kidney pixels in non-kidney images.

> In this case, I was thinking of training the segmentation models on images with just kidneys

Regardless of which path you take, I would lean against training only
with images of kidneys. As a general rule, you’re better off training with
the full spectrum of data your model will be seeing in its final application
(if you can).

I do think post-training a classifier on top of your pre-trained segmenter
makes sense. I don’t have intuition about whether you should graft your
classifier onto your segmenter at the “encoder” level or at the final layer.
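
For what it's worth, the two grafting points could be sketched roughly as follows. `TinySegmenter` is a hypothetical stand-in for the pre-trained segmenter, used only so the snippet runs on its own; the class names, layer sizes, and the single-logit output are all assumptions, not anything from the original setup.

```python
import torch
import torch.nn as nn

class TinySegmenter(nn.Module):
    """Hypothetical stand-in for a pre-trained segmenter."""
    def __init__(self, n_class: int = 2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(32, n_class, 1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

class EncoderClassifier(nn.Module):
    """Option A: classify from the encoder's deepest feature map."""
    def __init__(self, segmenter: TinySegmenter, feat_channels: int = 32):
        super().__init__()
        self.segmenter = segmenter
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(feat_channels, 1),
        )

    def forward(self, x):
        return self.head(self.segmenter.encoder(x))  # (N, 1) kidney logit

class OutputClassifier(nn.Module):
    """Option B: classify from the segmenter's final per-pixel output."""
    def __init__(self, segmenter: TinySegmenter, n_class: int = 2):
        super().__init__()
        self.segmenter = segmenter
        self.head = nn.Sequential(
            nn.Conv2d(n_class, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1),
        )

    def forward(self, x):
        return self.head(self.segmenter(x))          # (N, 1) kidney logit

x = torch.randn(2, 3, 64, 64)
seg = TinySegmenter()
out_a = EncoderClassifier(seg)(x)
out_b = OutputClassifier(seg)(x)
```

Option A sees richer features but ignores whatever the decoder has learned; option B only sees the per-pixel scores, which makes it closer to a learned version of thresholding. If I remember correctly, `smp.Unet` also accepts an `aux_params` dictionary that attaches a classification head to the encoder, which would be one way to get option A without writing the head by hand.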

One thing you might want to experiment with is training just your
classifier weights vs. training all of the weights, including those that
you had pre-trained for the segmenter. In the latter case, it might
make sense to first freeze your segmenter weights while you partially
train your classifier weights to get them in the right ballpark, and then
fine-tune your full-model weights, training both the segmenter and
classifier weights jointly.
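
The freeze-then-fine-tune schedule might look something like the sketch below. Everything here is a placeholder: `segmenter` and `head` are tiny stand-in modules, the data is random, and the step counts and learning rates (including the smaller rate in the second stage) are assumptions rather than recommendations.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins: `segmenter` plays the pre-trained model,
# `head` the newly added classifier.
segmenter = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 2, 1))
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(2, 1))
model = nn.Sequential(segmenter, head)

def set_trainable(module: nn.Module, flag: bool) -> None:
    """Turn gradient tracking on or off for every parameter of `module`."""
    for p in module.parameters():
        p.requires_grad = flag

def train_steps(model, optimizer, images, labels, n_steps):
    """Run a few optimizer steps on a fixed batch with a binary loss."""
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(n_steps):
        optimizer.zero_grad()
        loss = loss_fn(model(images).squeeze(1), labels)
        loss.backward()
        optimizer.step()

# Random placeholder batch standing in for (image, has-kidney label) pairs.
images = torch.randn(4, 3, 32, 32)
labels = torch.tensor([1.0, 0.0, 1.0, 0.0])

# Stage 1: freeze the segmenter and train only the classifier head.
set_trainable(segmenter, False)
before = [p.detach().clone() for p in segmenter.parameters()]
train_steps(model, torch.optim.Adam(head.parameters(), lr=1e-3), images, labels, n_steps=5)
stage1_unchanged = all(torch.equal(a, b) for a, b in zip(before, segmenter.parameters()))

# Stage 2: unfreeze everything and fine-tune jointly at a smaller learning rate.
set_trainable(segmenter, True)
train_steps(model, torch.optim.Adam(model.parameters(), lr=1e-4), images, labels, n_steps=5)
```

If the real segmenter contains BatchNorm layers, it may also be worth putting it in eval mode during the first stage so that its running statistics stay fixed along with its weights.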

(Of course if it were me, I would go with thresholding. Do you have
any examples where thresholding misclassifies?)

Good luck.

K. Frank