Convert RGB to gray

Hi all,
I’m trying to use the SVHN dataset class from torchvision:
https://github.com/pytorch/vision/blob/master/torchvision/datasets/svhn.py

I want to convert the RGB images to grayscale.
Can I do this with DataLoader options such as num_workers, sampler, …? If not, how can I do it?


One simple approach would be to override the __getitem__ method from torchvision.datasets.SVHN and convert the image to grayscale using PIL.Image.convert('LA'):


import numpy as np
from PIL import Image
from torchvision import datasets


class MyDataset(datasets.SVHN):
    def __init__(self,  root, split='train',
                 transform=None, target_transform=None, download=False):
        super(MyDataset, self).__init__(
            root, split, transform, target_transform, download)
    
    def __getitem__(self, index):
        """
        Args:
            index (int): Index
        Returns:
            tuple: (image, target) where target is index of the target class.
        """
        img, target = self.data[index], int(self.labels[index])

        # doing this so that it is consistent with all other datasets
        # to return a PIL Image
        img = Image.fromarray(np.transpose(img, (1, 2, 0)))
        # Convert to grayscale
        img = img.convert('LA')
        if self.transform is not None:
            img = self.transform(img)

        if self.target_transform is not None:
            target = self.target_transform(target)

        return img, target

Thanks for the reply, but this gives me 2 channels.
How can I get a single channel instead?

Use this:
torchvision.transforms.functional.to_grayscale

@ptrblck why do we need the transpose here, in img = Image.fromarray(np.transpose(img, (1, 2, 0)))?

I need to convert images to grayscale and did the same, but I got this error:
File "C:\Users\Neda\Anaconda3\lib\site-packages\PIL\Image.py", line 2463, in fromarray
    raise TypeError("Cannot handle this data type")
TypeError: Cannot handle this data type
This is what I am doing. Am I making a mistake?

      
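import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms


class CustomDataset(Dataset):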
    def __init__(self, image_paths, target_paths):   # initial logic happens like transform

        self.image_paths = image_paths
        self.target_paths = target_paths
        #self.convert_gray = transforms.Grayscale(1)
        self.transforms = transforms.ToTensor()
        self.mapping = {
            0: 0,
            255: 1
#            85: 0,
#            170: 1,
#            255: 2               
        }
    def mask_to_class(self, mask):
        for k in self.mapping:
            mask[mask==k] = self.mapping[k]
        return mask
    
    def __getitem__(self, index):

        image = Image.open(self.image_paths[index])
        mask = Image.open(self.target_paths[index])
        #t_image = self.convert_gray(image)
        img = Image.fromarray(np.transpose(image, (1, 2, 0)))
        t_image = img.convert('LA')
        t_image = self.transforms(t_image)
        #mask = torch.from_numpy(np.array(mask))    #this is for BMCC dataset
        mask = torch.from_numpy(np.array(mask, dtype=np.uint8)) # this is for my dataset(lv)
        mask = self.mask_to_class(mask)
        mask = mask.long()
        return t_image, mask

    def __len__(self):  # return count of sample we have

        return len(self.image_paths)

We need to transpose the images in my example, as the data stored in the SVHN dataset is already in the “right” format ([c, h, w]). Since we are using PIL to convert them to grayscale, we need to pass the array as [h, w, c].

It won’t be necessary in your example, because you are already loading your data as PIL.Images.
Could you comment out the img = Image.fromarray line, call convert on image directly, and try it again?
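
To illustrate the layout point, here is a small sketch using only NumPy and PIL. The zero-filled array is a stand-in for one entry of the SVHN data array, not real data, and the variable names are made up.

```python
import numpy as np
from PIL import Image

# Stand-in for one SVHN sample as stored in the dataset: uint8, [c, h, w].
chw = np.zeros((3, 32, 32), dtype=np.uint8)

# PIL expects [h, w, c], so move the channel axis to the end.
hwc = np.transpose(chw, (1, 2, 0))
img = Image.fromarray(hwc)
print(hwc.shape, img.mode, img.size)

# Handing PIL the [c, h, w] array directly is rejected with a TypeError.
try:
    Image.fromarray(chw)
    failed = False
except TypeError:
    failed = True
print(failed)
```

An image opened with Image.open is already a PIL image, so transposing it first only scrambles the axes, which is likely what triggers the "Cannot handle this data type" error above.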


@ptrblck thank you. I did that, but now the image has 2 channels:
RuntimeError: Given groups=1, weight of size [5, 1, 3, 3], expected input[1, 2, 240, 320] to have 1 channels, but got 2 channels instead

t_image = image.convert('LA') is supposed to convert it to a grayscale image, is that right?

Yes, your image will be converted to a grayscale image, but an additional alpha channel will be added.
You could try to use convert('L') instead.
Also, note that you might have to unsqueeze the channel dimension after converting it to grayscale.
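
A quick way to see the difference between the two modes, sketched with NumPy and PIL only. The random 240x320 image is a stand-in chosen to match the shape in the error message, and the variable names are made up.

```python
import numpy as np
from PIL import Image

# Stand-in RGB image, 240 rows by 320 columns (random pixels).
rgb = Image.fromarray(np.random.randint(0, 256, (240, 320, 3), dtype=np.uint8))

gray_alpha = np.asarray(rgb.convert('LA'))  # luminance plus alpha
gray_only = np.asarray(rgb.convert('L'))    # luminance only

print(gray_alpha.shape)  # (240, 320, 2)
print(gray_only.shape)   # (240, 320)

# Add the missing channel axis in front to get [c, h, w] with c = 1.
chw = gray_only[np.newaxis, ...]
print(chw.shape)         # (1, 240, 320)
```

With a torch tensor the last step would be tensor.unsqueeze(0). If the image goes through transforms.ToTensor(), a mode 'L' image already comes out as [1, h, w], so no extra unsqueeze is needed in that case.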


Yes, it works. Thanks a lot!
