Grayscale to RGB transform

While loading your images, you could call PIL's .convert('RGB') on all images.
If you are using ImageFolder, this functionality should be already there using the default loader.
Alternatively, you could repeat the values:

import torch
x = torch.randn(28, 28)
x = x.repeat(3, 1, 1)
print(x.shape)
> torch.Size([3, 28, 28])

I am using it with MNIST, loaded through datasets.MNIST.
However, I am not sure how to call the .convert('RGB') conversion, as it is already there, as you noted.

The MNIST dataset doesn’t convert the images to RGB, but to a grayscale image. Have a look at this line of code.

I assume you are using the MNIST data with another color image set?
If so, you could check in __getitem__, if it’s already a color image, and if not use my second approach to convert it.
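
A minimal sketch of that check, assuming the underlying dataset yields (image, label) pairs whose images are already tensors. The wrapper name EnsureThreeChannels is hypothetical and not part of torchvision:

```python
import torch
from torch.utils.data import Dataset


class EnsureThreeChannels(Dataset):
    """Hypothetical wrapper: returns every image as a 3-channel CHW tensor."""

    def __init__(self, base):
        self.base = base  # any indexable dataset yielding (tensor image, label)

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        img, label = self.base[idx]
        if img.dim() == 2:       # H x W -> 1 x H x W
            img = img.unsqueeze(0)
        if img.shape[0] == 1:    # grayscale: replicate the single channel
            img = img.repeat(3, 1, 1)
        return img, label


# Toy data standing in for a mixed grayscale/color dataset.
mixed = [(torch.rand(1, 28, 28), 0), (torch.rand(3, 28, 28), 1)]
wrapped = EnsureThreeChannels(mixed)
shapes = [tuple(wrapped[i][0].shape) for i in range(len(wrapped))]
print(shapes)
```

Color images pass through untouched, so both kinds of sample can end up in the same batch.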

If you replace y = torch.stack([xx,xx,xx],0) with y = torch.stack([xx,xx,xx],2) it works:

import torch
xx = torch.rand(28,28)
y = torch.stack([xx,xx,xx],2)  # stack along the last dim -> shape (28, 28, 3)
print(torch.norm(y[:,:,0] - xx))  # tensor(0.)

Thanks Karan. For some reason, the statement that got things done was the one that ptrblck suggested:

transforms.Lambda(lambda x: x.repeat(3, 1, 1) )


You can also convert a 2D grayscale image to a 3D RGB one by doing:

img = img.view(width, height, 1).expand(-1, -1, 3)

Calling .repeat will actually replicate the image data (taking 3x the memory of the original image) whereas .expand will behave as if the data is replicated without actually doing so. Thus .expand is probably better unless you want to change the channels independently of each other.
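
The difference can be checked directly. This is a small sketch using a channels-first (C, H, W) tensor rather than the (width, height, 1) layout above; the variable names are only illustrative:

```python
import torch

gray = torch.rand(1, 28, 28)

repeated = gray.repeat(3, 1, 1)    # allocates a new 3 x 28 x 28 buffer
expanded = gray.expand(3, -1, -1)  # a view onto the original 1 x 28 x 28 buffer

print(repeated.shape, expanded.shape)
print(expanded.stride())                        # stride 0 along the channel dim
print(expanded.data_ptr() == gray.data_ptr())   # shares memory with gray
print(repeated.data_ptr() == gray.data_ptr())   # has its own memory

# Because the expanded channels alias the same memory, a write to the
# source shows up in all three channels; the repeated copy is unaffected.
gray[0, 0, 0] = 42.0
print(expanded[:, 0, 0])

# Call .clone() on the expanded tensor if the channels must be modified
# independently later on.
independent = expanded.clone()
```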


That’s nice!
How about speed/performance, Repeat vs Expand?

A few of my files are grayscale, but most are jpeg RGB.
Why does the following not work?

im = Image.open(...)  # path elided in the original post
im.convert("RGB")
im_torch = torchvision.transforms.ToTensor()(im)

Just like the suggestion above, I need to add

if im_torch.shape[0]==1:
    print(f"im_torch.shape={im_torch.shape}") # im_torch.shape=torch.Size([1, 4077, 4819])
    im_torch = im_torch.expand(3,-1,-1)
    print(f"im_torch.shape={im_torch.shape}") # im_torch.shape=torch.Size([3, 4077, 4819])

notice the output of the first print statement is

im_torch.shape=torch.Size([1, 4077, 4819])

So does im.convert("RGB") not convert the file?

You missed storing the converted image:

im = im.convert('RGB')
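
A short check of that behaviour, using a synthetic image instead of a file so that it runs anywhere (the 4x4 size and the gray value 128 are arbitrary):

```python
from PIL import Image

gray = Image.new("L", (4, 4), color=128)  # synthetic 4x4 grayscale image

gray.convert("RGB")            # return value discarded: gray is unchanged
mode_after_discard = gray.mode

rgb = gray.convert("RGB")      # keep the returned image instead
print(mode_after_discard, rgb.mode, rgb.getpixel((0, 0)))
# L RGB (128, 128, 128)
```

convert returns a new image and leaves the original untouched, which is why the result has to be assigned.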

Else, you can try this:

class NoneTransform(object):
    ''' Does nothing to the image. To be used instead of None '''
    def __call__(self, image):
        return image

im = Image.open(...)  # path elided in the original post

T = transforms.Compose([
    transforms.ToTensor(),  # assumed: x.repeat needs a tensor, not a PIL image
    transforms.Lambda(lambda x: x.repeat(3, 1, 1)) if im.mode != 'RGB' else NoneTransform()
])

im_torch = T(im)

NB. As far as I remember, None would not work if used instead of NoneTransform()
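
Note that the conditional above is evaluated once, when the Compose is built, so it only fits the single image it was built for. A possible alternative is a transform that checks each sample when it is called. The class name GrayToRGB is hypothetical, and the sketch assumes it receives a (C, H, W) tensor, for example when placed after transforms.ToTensor():

```python
import torch


class GrayToRGB:
    """Hypothetical transform: expand 1-channel CHW tensors to 3 channels."""

    def __call__(self, x):
        if x.shape[0] == 1:
            return x.expand(3, -1, -1)  # view only, no copy of the data
        return x                        # already multi-channel: pass through


to_rgb = GrayToRGB()
out_gray = to_rgb(torch.rand(1, 8, 8))
out_color = to_rgb(torch.rand(3, 8, 8))
print(out_gray.shape, out_color.shape)
```

Since the instance is a plain callable, it can be listed in transforms.Compose like any other transform.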


Thank you. How silly of me.

No worries. This happens to everyone. We are used to OOP, and thus, we expect that im.convert('RGB') does the job.

How does converting gray scale to rgb work?

Does it duplicate the channel?

It depends on what you want to do. Above, the channels are replicated. If you want to colorize grayscale images, then you need to use a colorization algorithm.




I don't know if this is something wrong with Pillow, but I noticed that its convert method loses all information from the loaded int32 grayscale image and sets all values to 255. I can confirm that the entropy of the image was definitely higher before I converted it to "RGB". Stacking the image by hand works, but causes problems for the image transformations I want to apply.

I guess you are converting the image array from int32 to uint8, so the clipping would be expected.
From the mode docs:

RGB (3x8-bit pixels, true color)

Yes, you are correct. Any idea how to convert from int32 to uint8 without clipping? My images are always loaded as int32. The mode of the images is set to "I", which according to the docs means 32-bit signed integer pixels.

Excuse me, will the result be the same? I mean, will using convert('RGB') and repeating the grayscale values give the same output?

That won't be possible, as int32 uses 32 bits and has a wider range than uint8, which uses 8 bits.
Unless you make sure the original int32 image doesn't contain values <0 or >255, they would be clipped.
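
If losing precision is acceptable, one common workaround is to rescale the value range linearly before casting. This is not lossless and was not proposed in this thread; it is only a sketch, and the helper name rescale_to_uint8 is hypothetical:

```python
import numpy as np


def rescale_to_uint8(arr):
    """Linearly map arr's [min, max] range onto [0, 255] (hypothetical helper)."""
    arr = arr.astype(np.float64)
    lo, hi = arr.min(), arr.max()
    if hi == lo:  # constant image: avoid division by zero
        return np.zeros(arr.shape, dtype=np.uint8)
    scaled = (arr - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)


# Toy int32 "image" with values far outside the uint8 range.
img_i32 = np.array([[0, 1000], [30000, 65535]], dtype=np.int32)
img_u8 = rescale_to_uint8(img_i32)
print(img_u8.dtype, img_u8.min(), img_u8.max())
```

The relative ordering of the pixel values is kept, but distinct int32 values can collapse onto the same uint8 value.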

Yes, this should be the case:

import numpy as np
import torch
from torchvision import transforms

img = transforms.ToPILImage()(torch.randint(0, 256, (224, 224)).byte())
img_rgb = img.convert("RGB")

img_rgb_arr = np.array(img_rgb)
img_arr = np.array(img)

for c in range(img_rgb_arr.shape[2]):
    print(c, np.all(img_arr == img_rgb_arr[:, :, c]))
# 0 True
# 1 True
# 2 True

print((torch.from_numpy(img_rgb_arr) == torch.from_numpy(img_arr).unsqueeze(2).expand(-1, -1, 3)).all())
# tensor(True)

Perhaps you can check that. The question was basically about a function that can be used as part of transforms.Compose.