How to convert a 3 channel ground truth mask image into single channel mask for the U-Net Model

I have a 3-channel image in which each channel encodes some information as colors. I want to convert it to a single-channel image that retains all of that information. When I convert it to one channel (using grayscale), I lose the color information and get a tensor of all zeros; visualizing it shows a completely black image.

So, is there any way to convert the 3-channel label/mask image to a 1-channel image without using grayscale?

My professor asked me to do this…

Looking forward to your responses as soon as possible.


Could you show a code snippet that shows us why you get a black image?

As far as I know you have two options:

  1. Convert to grayscale
  2. Adapt the U-Net to accept multiple channels by changing the network architecture.

Thanks for your response.
Actually, my ground-truth image has three channels when I read it and turn it into tensors for further processing. When I convert the ground truth to grayscale, it shows only zeros. In reality it should contain three classes, and reading the three channels directly does show the three classes correctly.

Note: I only face this problem with the ground-truth images; otherwise the code works well. Also, if there is a model that accepts 3-channel ground-truth images, please suggest it.
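One possible explanation for the all-zero grayscale result, sketched below on synthetic data: if the labels are small integers (0, 1, 2) stored in one channel, a luminance-weighted grayscale conversion scales them below 1 and the cast back to uint8 truncates them to 0. The array contents and the assumption that the labels live in the first channel are illustrative only; the real masks may be laid out differently, so it is worth printing the unique values per channel first.

```python
import numpy as np

# Synthetic stand-in for a ground-truth mask of shape (H, W, 3), dtype uint8.
# ASSUMPTION: labels 0, 1, 2 are stored in the first channel and the other
# two channels are empty. Check your own masks before relying on this.
mask = np.zeros((4, 4, 3), dtype=np.uint8)
mask[1:3, 1:3, 0] = 1
mask[3, :, 0] = 2

# Inspect which values actually occur in each channel.
per_channel_values = [np.unique(mask[..., c]).tolist() for c in range(3)]
print(per_channel_values)

# Luminance-weighted grayscale conversion followed by a cast back to uint8.
# Labels 1 and 2 become 0.299 and 0.598, which truncate to 0.
weights = np.array([0.299, 0.587, 0.114])
gray = (mask @ weights).astype(np.uint8)
print(gray.max())

# Taking the channel that holds the labels keeps the classes intact.
label_mask = mask[..., 0]
print(np.unique(label_mask).tolist())
```

If the per-channel printout shows small integer labels in a single channel, slicing that channel out already gives a single-channel mask without any grayscale conversion.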

I don't quite understand your ground-truth image format. Could you please show us a small sample of your code so that we can help you further?

Here is a sample of the code. I am working on the PANDA Kaggle competition dataset, and I read the TIFF images using skimage's MultiImage. After building the data loaders, this is the code snippet:

from torch.utils.data import DataLoader

# TrainDataset is the poster's own Dataset class (not shown in this thread).
train_dataset = TrainDataset(X_train, y_train, transform=None)
valid_dataset = TrainDataset(X_valid, y_valid, transform=None)
train_loader = DataLoader(train_dataset, batch_size=4, num_workers=4)
valid_loader = DataLoader(valid_dataset, batch_size=4, num_workers=4)


In this image, you can see the shape of the images and their ground-truth values.

OK, so from the PANDA description I assume that each of the 3 channels corresponds to an individual class and that the channels are binary. You would need to set the U-Net's number of output classes to three in order to predict the three channels. Something like a Dice loss is then probably a good way to calculate the loss.
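For reference, a soft Dice loss for this setup could look roughly like the sketch below. It assumes the network outputs raw logits of shape (N, C, H, W) and that the target has the same shape with one binary channel per class; the function name dice_loss and the eps smoothing term are illustrative choices, not taken from any particular library.

```python
import torch

def dice_loss(logits, target, eps=1e-6):
    """Soft Dice loss for multi-channel binary masks.

    logits: (N, C, H, W) raw network outputs (ASSUMED layout)
    target: (N, C, H, W) binary ground truth, one channel per class
    """
    probs = torch.sigmoid(logits)
    target = target.float()
    dims = (0, 2, 3)  # sum over batch and spatial dims, keep the channel dim
    intersection = (probs * target).sum(dim=dims)
    denominator = probs.sum(dim=dims) + target.sum(dim=dims)
    dice_per_channel = (2 * intersection + eps) / (denominator + eps)
    # Average the per-channel Dice scores and turn them into a loss.
    return 1 - dice_per_channel.mean()

# Small sanity check on random binary targets.
torch.manual_seed(0)
target = (torch.rand(2, 3, 8, 8) > 0.5).float()
good_logits = (target * 2 - 1) * 20   # confident and correct predictions
bad_logits = -good_logits             # confident and wrong predictions
loss_good = dice_loss(good_logits, target).item()
loss_bad = dice_loss(bad_logits, target).item()
print(loss_good, loss_bad)
```

The loss should be close to 0 when the predictions match the mask and close to 1 when they are inverted, which is a quick way to check the implementation before training.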


Thanks for your time. I was thinking along the same lines as your suggestion.
But my professor suggested that I make a single-channel mask, which confuses me. So can you suggest a U-Net model that accepts 3-channel images and 3-channel label masks, so that I can use this data as it is without any changes?
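If a single-channel mask is what is wanted, one common approach is to map each color in the 3-channel mask to an integer class index instead of converting to grayscale. The sketch below assumes a color-coded mask of shape (H, W, 3); the PALETTE colors and class ids are hypothetical and would need to be replaced with the colors actually used in the dataset.

```python
import numpy as np

# HYPOTHETICAL palette: these colors and class ids are placeholders only.
PALETTE = {
    (0, 0, 0): 0,      # background
    (255, 0, 0): 1,    # class 1
    (0, 255, 0): 2,    # class 2
}

def rgb_mask_to_index(mask_rgb, palette):
    """Map an (H, W, 3) color-coded mask to an (H, W) mask of class indices."""
    index_mask = np.full(mask_rgb.shape[:2], -1, dtype=np.int64)
    for color, class_id in palette.items():
        # Pixels whose three channel values all equal this palette color.
        matches = np.all(mask_rgb == np.array(color, dtype=mask_rgb.dtype), axis=-1)
        index_mask[matches] = class_id
    if (index_mask < 0).any():
        raise ValueError("mask contains a color that is not in the palette")
    return index_mask

# Tiny example mask with one pixel of each non-background class.
mask_rgb = np.zeros((2, 3, 3), dtype=np.uint8)
mask_rgb[0, 1] = (255, 0, 0)
mask_rgb[1, 2] = (0, 255, 0)

index_mask = rgb_mask_to_index(mask_rgb, PALETTE)
print(index_mask)
```

An (H, W) integer mask like this is the target format that torch.nn.CrossEntropyLoss expects for segmentation (batched as (N, H, W) against logits of shape (N, C, H, W)), so it can be used in place of a 3-channel mask if the classes are mutually exclusive.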

Sure, this seems to me like a solid implementation of the U-Net. You can adjust the number of output channels using n_classes and the input channels using n_channels.