Grayscale to RGB transform

Some of the images in my dataset are grayscale, so I need to convert them to RGB by replicating the single channel into each of the three bands. I am using transforms.Lambda with torch.cat to do that, but it does not give the result I expected.
Example: let xx be an image of size 28x28. Then:

In [67]: xx.shape
Out[67]: torch.Size([28, 28])

In [68]: y =torch.cat([xx,xx,xx],0)

In [69]: y.shape
Out[69]: torch.Size([84, 28])

I expected the size to be [28, 28, 3]. Any suggestions on how to resolve this?

Here’s the transform I am trying:

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Lambda(lambda x: torch.cat([x, x, x], 0)),
    transforms.Normalize(mean, std),
])
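
For anyone comparing the two snippets above, here is a minimal sketch of why the interactive test and the transform behave differently. It assumes a random 28x28 tensor as a stand-in for the image, and uses unsqueeze(0) to imitate the [1, H, W] tensor that transforms.ToTensor() returns for a grayscale image.

```python
import torch

xx = torch.rand(28, 28)

# cat joins along an existing dimension, so the rows are concatenated.
cat_2d = torch.cat([xx, xx, xx], 0)
print(cat_2d.shape)  # torch.Size([84, 28])

# stack creates a new dimension: dim=0 is channel-first, dim=2 is channel-last.
chw = torch.stack([xx, xx, xx], 0)
hwc = torch.stack([xx, xx, xx], 2)
print(chw.shape)  # torch.Size([3, 28, 28])
print(hwc.shape)  # torch.Size([28, 28, 3])

# ToTensor returns [C, H, W], so a grayscale image is already [1, 28, 28].
# Concatenating along dim 0 then yields three channels.
x = xx.unsqueeze(0)  # stand-in for the output of transforms.ToTensor()
cat_3d = torch.cat([x, x, x], 0)
print(cat_3d.shape)  # torch.Size([3, 28, 28])
```

So the transform as written should already produce a [3, 28, 28] tensor, which is the channel-first layout that transforms.Normalize expects. Only the test on a plain 2D tensor gives [84, 28].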

While loading your images, you could use Image.open(path).convert('RGB') on all images.
If you are using ImageFolder, this functionality should be already there using the default loader.
Alternatively, you could repeat the values:

x = torch.randn(28, 28)
x.unsqueeze_(0)
x = x.repeat(3, 1, 1)
x.shape
> torch.Size([3, 28, 28])

I am using it with MNIST, loaded through datasets.MNIST.
I am not sure, however, where to call Image.open(path).convert('RGB'), since, as you noted, the loading is already handled for me.

The MNIST dataset doesn’t convert the images to RGB; it loads them as grayscale images. Have a look at this line of code.

I assume you are using the MNIST data with another color image set?
If so, you could check in __getitem__, if it’s already a color image, and if not use my second approach to convert it.
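
A rough sketch of that __getitem__ check, in case it helps. ToThreeChannels is a hypothetical wrapper name (not part of torchvision), and it assumes the wrapped dataset already returns (tensor_image, label) pairs with images shaped [H, W], [1, H, W] or [3, H, W].

```python
import torch
from torch.utils.data import Dataset


class ToThreeChannels(Dataset):
    """Hypothetical wrapper that returns every image as a [3, H, W] tensor."""

    def __init__(self, base):
        self.base = base  # anything indexable that yields (tensor_image, label)

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        img, label = self.base[idx]
        if img.dim() == 2:         # [H, W] -> [1, H, W]
            img = img.unsqueeze(0)
        if img.shape[0] == 1:      # grayscale: replicate the single channel
            img = img.repeat(3, 1, 1)
        return img, label          # color images pass through unchanged


# Toy stand-in for a mixed dataset: one grayscale and one color sample.
samples = [(torch.rand(1, 28, 28), 0), (torch.rand(3, 28, 28), 1)]
wrapped = ToThreeChannels(samples)
for img, label in wrapped:
    print(label, img.shape)
```

The same check could instead live in the __getitem__ of your own dataset class; the wrapper is only one way to keep it separate from the loading code.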

If you replace y = torch.cat([xx, xx, xx], 0) with y = torch.stack([xx, xx, xx], 2), it works:

import torch
xx = torch.rand(28, 28)
y = torch.stack([xx, xx, xx], 2)    # new last dimension -> [28, 28, 3]
print(xx.shape)
print(y.shape)
print(torch.norm(y[:, :, 0] - xx))  # 0, so the channel equals xx

Thanks Karan. For some reason, the statement that got things done was the one that ptrblck suggested:

transforms.Lambda(lambda x: x.repeat(3, 1, 1) )


You can also convert a 2D grayscale image to a 3D RGB one by doing:

img = img.view(height, width, 1).expand(-1, -1, 3)

Calling .repeat will actually replicate the image data (taking 3x the memory of the original image) whereas .expand will behave as if the data is replicated without actually doing so. Thus .expand is probably better unless you want to change the channels independently of each other.
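
To make the memory difference concrete, here is a small sketch on a random [1, 28, 28] tensor (an assumed stand-in for a grayscale image in channel-first layout). It only inspects strides and data pointers; it is not a benchmark.

```python
import torch

x = torch.rand(1, 28, 28)

repeated = x.repeat(3, 1, 1)    # allocates a new [3, 28, 28] tensor
expanded = x.expand(3, -1, -1)  # a view on the original data

print(repeated.shape, expanded.shape)
print(expanded.stride())                    # stride 0 along the channel dim
print(expanded.data_ptr() == x.data_ptr())  # True: same memory as x
print(repeated.data_ptr() == x.data_ptr())  # False: separate copy
print(torch.equal(repeated, expanded))      # True: same values

# If later code needs independent, writable channels, materialize the view.
materialized = expanded.contiguous()
print(materialized.data_ptr() == x.data_ptr())  # False: now a real copy
```

Because all three channels of the expanded tensor alias the same memory, in-place operations on it are best avoided; calling .contiguous() (or using .repeat from the start) gives a regular tensor again.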


That’s nice!
How about speed/performance, Repeat vs Expand?

A few of my files are grayscale, but most are jpeg RGB.
Why does the following not work?

im = PIL.Image.open(img_path)
im.convert("RGB")
im_torch = torchvision.transforms.ToTensor()(im)

Just like the suggestion above, I need to add

if im_torch.shape[0] == 1:
    print(f"im_torch.shape={im_torch.shape}")  # im_torch.shape=torch.Size([1, 4077, 4819])
    im_torch = im_torch.expand(3, -1, -1)
    print(f"im_torch.shape={im_torch.shape}")  # im_torch.shape=torch.Size([3, 4077, 4819])

Notice that the output of the first print statement is

im_torch.shape=torch.Size([1, 4077, 4819])

So does im.convert("RGB") not convert the file?

You missed storing the converted image:

im = im.convert('RGB')
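
A tiny sketch of the difference, using a synthetic 4x4 grayscale image created with Image.new instead of a file on disk (so img_path is not needed here):

```python
from PIL import Image

im = Image.new("L", (4, 4), color=128)  # small synthetic grayscale image

im.convert("RGB")        # return value discarded: im itself is unchanged
mode_before = im.mode
print(mode_before)       # L

im = im.convert("RGB")   # keep the converted copy
print(im.mode)           # RGB
print(im.getpixel((0, 0)))  # (128, 128, 128)
```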

Else, you can try this:

import PIL.Image
from torchvision import transforms


class NoneTransform(object):
    ''' Does nothing to the image. To be used instead of None '''

    def __call__(self, image):
        return image


im = PIL.Image.open(img_path)

# Replicate the single channel only when the image is not already RGB.
T = transforms.Compose([
    transforms.ToTensor(),
    transforms.Lambda(lambda x: x.repeat(3, 1, 1)) if im.mode != 'RGB' else NoneTransform()
])

im_torch = T(im)

NB. As far as I remember, None would not work if used instead of NoneTransform()


Thank you. How silly of me.

No worries. This happens to everyone. We are used to OOP, and thus we expect that calling im.convert('RGB') modifies im in place, whereas it actually returns a new image.

How does converting grayscale to RGB work?

Does it duplicate the channel?

It depends on what you want to do. In the approaches above, the single channel is replicated. If you want to colorize grayscale images, you need to use a colorization algorithm.


transforms.Grayscale(3)


I don’t know if this is something wrong with Pillow, but I noticed that its convert method loses all information from the loaded int32 grayscale image and sets all values to 255. I can confirm that the entropy of the image was definitely higher before I converted it to “RGB”. Stacking the image by hand works, but causes problems for the image transformations I want to apply.

I guess you are converting the image array from int32 to uint8, so the clipping would be expected.
From the mode docs:

RGB (3x8-bit pixels, true color)

Yes, you are correct. Any idea how to convert from int32 to uint8 without clipping? My images always get loaded as int32. Their mode is set to “I”, which according to the docs means 32-bit signed integer pixels.

Excuse me, will the result be the same? I mean, will using convert('RGB') and repeating the grayscale values give the same result?

That won’t be possible, as int32 uses 32 bits and has a wider range than uint8, which uses 8 bits.
Unless you make sure the original int32 image doesn’t contain values <0 or >255, they will be clipped.
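
If losing precision is acceptable, one common workaround is to rescale the values into [0, 255] before converting, instead of letting them be clipped. The sketch below is only an illustration: rescale_to_uint8 is a hypothetical helper, the 2x2 array stands in for np.array(im) of a mode “I” image, and min-max scaling is just one possible choice of mapping.

```python
import numpy as np
from PIL import Image


def rescale_to_uint8(arr):
    """Hypothetical helper: linearly map arr's [min, max] onto [0, 255]."""
    arr = arr.astype(np.float64)
    lo, hi = arr.min(), arr.max()
    if hi == lo:  # constant image: avoid dividing by zero
        return np.zeros(arr.shape, dtype=np.uint8)
    return np.round((arr - lo) / (hi - lo) * 255.0).astype(np.uint8)


# Synthetic stand-in for the pixel data of a mode "I" (int32) image.
arr_i32 = np.array([[0, 1000], [2000, 4000]], dtype=np.int32)

arr_u8 = rescale_to_uint8(arr_i32)
img_rgb = Image.fromarray(arr_u8).convert("RGB")  # uint8 2D array -> "L" -> "RGB"

print(arr_u8.dtype, arr_u8.min(), arr_u8.max())
print(img_rgb.mode, img_rgb.size)
```

Note that this quantizes the data to 256 levels and scales each image by its own range, so absolute intensities are no longer comparable across images unless a fixed range is used instead.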

Yes, convert('RGB') and repeating the grayscale values should give the same result:

import numpy as np
import torch
from torchvision import transforms

img = transforms.ToPILImage()(torch.randint(0, 256, (224, 224)).byte())
img_rgb = img.convert("RGB")

img_rgb_arr = np.array(img_rgb)
img_arr = np.array(img)

for c in range(img_rgb_arr.shape[2]):
    print(c, np.all(img_arr == img_rgb_arr[:, :, c]))
# 0 True
# 1 True
# 2 True

print((torch.from_numpy(img_rgb_arr) == torch.from_numpy(img_arr).unsqueeze(2).expand(-1, -1, 3)).all())
# tensor(True)