How to convert RGB images with many different colors (not only red, green, blue) into classes for segmentation training? The mask is linked below.

This code shows an example of the transformation from colors to class indices.
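In spirit, it does something like the following minimal sketch, assuming the mask is loaded as an [H, W, 3] uint8 tensor (the function name here is illustrative):

import torch

# Sketch: map each unique color in an [H, W, 3] uint8 mask to a class index
def colors_to_indices(target):
    h, w = target.shape[0], target.shape[1]
    # Each unique color becomes one class index
    colors = torch.unique(target.view(-1, target.size(2)), dim=0)
    mapping = {tuple(c.tolist()): i for i, c in enumerate(colors)}

    target = target.permute(2, 0, 1).contiguous()  # [3, H, W]
    mask = torch.empty(h, w, dtype=torch.long)
    for color, idx in mapping.items():
        # Mark pixels where all three channels match the current color
        match = target == torch.tensor(color, dtype=target.dtype).view(3, 1, 1)
        mask[match.sum(0) == 3] = idx
    return mask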


Thank you so much for the reply! I was also wondering how to pass these values to the nn.CrossEntropyLoss function. Won't I get a dimensional mismatch?
I also wanted to know: after creating the masks with your code, can I directly return them from __getitem__?

Yes, you should be able to transform the color masks to the mask targets containing indices and return them in the __getitem__.

No, nn.CrossEntropyLoss won’t raise an error if you pass the model outputs in the shape [batch_size, nb_classes, height, width] and the masks as [batch_size, height, width] for a multi-class segmentation use case.
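For example, a quick shape check along these lines should run without errors (the sizes here are placeholder values):

import torch
import torch.nn as nn

batch_size, nb_classes, height, width = 2, 5, 4, 4
output = torch.randn(batch_size, nb_classes, height, width)         # model logits
target = torch.randint(0, nb_classes, (batch_size, height, width))  # class indices
loss = nn.CrossEntropyLoss()(output, target)
print(loss)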


Thanks for the reply again. I’m clear on how to pass the arguments to nn.CrossEntropyLoss, but I’m not clear on how to pass the masks and images.
What values should I return in my __getitem__? Can you help me out with this? I’m new to PyTorch.

In the linked code we actually have to specify the number of classes, i.e. in my case the number of different colors. I don’t have that information with me. So, is there any other solution?

That’s a rather uncommon use case. How would you define the last output layer, if the number of classes is unknown?
Are you planning to add an “unknown” class category which the model should use for all additional classes?


I’m still working on it. I was thinking “number of unique colors = number of classes”.
Or am I wrong? Is there a different way I could train using this dataset?

Yes, usually the number of all unique colors in the dataset would correspond to the number of classes.

As explained before, you could try to add “new/unknown” colors to a special “unknown” class category, but it really depends on your use case and what you are trying to achieve.


Thank you for the reply. Assuming that every image in my dataset contains the same number of unique colors, I will move ahead.
But I’m still not clear on how I can pass the mask after using the code.
Can I know exactly what the code will return, e.g. its shape or content?

The __getitem__ should return an input tensor in the shape [channels, height, width] and a mask tensor in the shape [height, width] containing class indices in the range [0, nb_classes-1].
The DataLoader will then add the batch dimension to these tensors, such that the input tensor will have a shape of [batch_size, channels, height, width] while the mask [batch_size, height, width].
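As a rough illustration (the dataset here is a toy stand-in with random data and placeholder sizes):

import torch
from torch.utils.data import Dataset, DataLoader

class ToySegDataset(Dataset):
    def __len__(self):
        return 8

    def __getitem__(self, idx):
        image = torch.randn(3, 32, 32)        # [channels, height, width]
        mask = torch.randint(0, 5, (32, 32))  # [height, width], indices in [0, 4]
        return image, mask

loader = DataLoader(ToySegDataset(), batch_size=4)
images, masks = next(iter(loader))
print(images.shape)  # torch.Size([4, 3, 32, 32])
print(masks.shape)   # torch.Size([4, 32, 32])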


Thank you for the reply, it helped me a lot! I’ll get back to you if something goes wrong while implementing this.

Hello sir, thank you for all the help. Is there code to get the number of unique colors in an image?

Yes, my linked code snippet gets all unique colors from the image.

[...]
# Get color codes for dataset (maybe you would have to use more than a single
# image, if it doesn't contain all classes)
target = torch.from_numpy(target)
colors = torch.unique(target.view(-1, target.size(2)), dim=0).numpy()
[...]
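Applied to a single mask image, counting the classes would then boil down to something like this (a minimal sketch; "mask.png" is a placeholder path):

import numpy as np
import torch
from PIL import Image

# Load the color mask as an [H, W, 3] uint8 tensor
target = torch.from_numpy(np.array(Image.open("mask.png").convert("RGB")))
colors = torch.unique(target.view(-1, target.size(2)), dim=0)
print(len(colors))  # number of unique colors, i.e. the estimated number of classes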

Ignore the previous error. I’m using the linked code as shown above; my code is below and I’m getting this error: "The shape of the mask [3, 1000] at index 0 does not match the shape of the indexed tensor [1000, 1000] at index 0"
What are the values of h and w I should use? I highly doubt I’m using the code properly.

def __getitem__(self, idx):
    # load images and masks
    img_path = os.path.join(self.root, "original_images", self.imgs[idx])
    mask_path = os.path.join(self.root, "col", self.masks[idx])
    img = Image.open(img_path).convert("RGB")
    # note that we haven't converted the mask to RGB,
    # because each color corresponds to a different instance
    # with 0 being background
    h, w = 1000, 1000

    mask = Image.open(mask_path)
    #mask = np.array(mask)

    # Create mapping
    # Get color codes for dataset (maybe you would have to use more than a single
    # image, if it doesn't contain all classes)
    #target = torch.from_numpy(mask)
    target = self.transform(mask)  # converts to tensor, resizes to 1000x1000
    colors = torch.unique(target.view(-1, target.size(2)), dim=0).numpy()
    target = target.permute(2, 0, 1).contiguous()
    mapping = {tuple(c): t for c, t in zip(colors.tolist(), range(len(colors)))}

    mask = torch.empty(h, w, dtype=torch.long)
    for k in mapping:
        # Get all indices for current class
        idx = (target == torch.tensor(k, dtype=torch.uint8).unsqueeze(1).unsqueeze(2))
        validx = (idx.sum(0) == 3)  # Check that all channels match
        mask[validx] = torch.tensor(mapping[k], dtype=torch.long)
    return self.transform(img), mask

Hello sir, is there any example code for this? I’m getting a "1only batches of spatial targets supported (3D tensors) but got targets of size: : [1, 1000, 1000, 3]" error.

You could compare my code example to yours and try to find the difference or are you also seeing this error using my code snippet?
If you cannot find the issue, feel free to post an executable code snippet which reproduces the error.

This is how I used your code:
I don’t have that error anymore, but is this the proper way to use your code?
This is the current error:
"ValueError: Expected target size (1, 480), got torch.Size([1, 3, 480])"

I used nn.CrossEntropyLoss() with num_classes=256 (randomly initialized, because I was getting len(colors) as 2992).

class Cus_dataset(torch.utils.data.Dataset):
    def __init__(self, root, transform, transformm):
        self.root = root
        self.transform = transform
        self.transformm = transformm
        # load all image files, sorting them to
        # ensure that they are aligned
        self.imgs = list(sorted(os.listdir(os.path.join(root, "original_images"))))
        self.masks = list(sorted(os.listdir(os.path.join(root, "col"))))

    def __getitem__(self, idx):
        # load images and masks
        img_path = os.path.join(self.root, "original_images", self.imgs[idx])
        mask_path = os.path.join(self.root, "col", self.masks[idx])
        img = Image.open(img_path).convert("RGB")
        # note that we haven't converted the mask to RGB,
        # because each color corresponds to a different instance
        # with 0 being background
        mask = Image.open(mask_path)
        mask = np.array(mask)
        #colors = np.unique(mask)
        #mask = self.transform(mask)

        def mask_to_class(mask):
            target = torch.from_numpy(mask)
            h, w = target.shape[0], target.shape[1]
            masks = torch.empty(h, w, dtype=torch.long)
            colors = torch.unique(target.view(-1, target.size(2)), dim=0).numpy()
            target = target.permute(2, 0, 1).contiguous()
            mapping = {tuple(c): t for c, t in zip(colors.tolist(), range(len(colors)))}
            for k in mapping:
                idx = (target == torch.tensor(k, dtype=torch.uint8).unsqueeze(1).unsqueeze(2))
                validx = (idx.sum(0) == 3)  # Check that all channels match
                masks[validx] = torch.tensor(mapping[k], dtype=torch.long)
            return masks

        masks = mask_to_class(mask)
        return masks

Since you don’t know the number of classes, I think you would have to estimate it.
Do 2992 unique colors look right for the processed dataset?

What shape does the model output have?
If you are working on a multi-class segmentation use case, the output should have the shape [batch_size, nb_classes, height, width], while the target should have the shape [batch_size, height, width] and contain class indices in the range [0, nb_classes-1].

My input image size is 480x480x3. If I assume it has 256 classes, then the last layer of my model will have a channel size of 256.
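To sanity check these shapes, something like the following should hold (a sketch; a 1x1 convolution stands in for the real model, and the 256 classes and 480x480 size are the values from the post above):

import torch
import torch.nn as nn

last_layer = nn.Conv2d(3, 256, kernel_size=1)  # stand-in for the real model
images = torch.randn(1, 3, 480, 480)
output = last_layer(images)
print(output.shape)          # torch.Size([1, 256, 480, 480])
pred = output.argmax(dim=1)  # per-pixel class indices
print(pred.shape)            # torch.Size([1, 480, 480])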