So the lazy way to do this is gray = img.mean(1), where dim 1 is the channel dimension in NCHW layout (but be careful when you have an alpha channel).
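A minimal sketch of the lazy approach, assuming an NCHW batch (the shapes here are just for illustration):

```python
import torch

img = torch.rand(4, 3, 64, 64)  # NCHW batch of RGB images

# Average over the channel dim; keepdim=True keeps a singleton channel
gray = img.mean(1, keepdim=True)  # shape (4, 1, 64, 64)

# With RGBA input, drop the alpha channel first, e.g.:
# gray = img[:, :3].mean(1, keepdim=True)
```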
However, that isn’t a good way, as the R, G, and B channels are not perceived as equally bright. People have thought about this and come up with various weights. A great way to apply these weights is a pointwise convolution (the same kind you see in ResNets and friends to change the number of channels; here 3 channels in, 1 out, 1x1 kernel — use torch.nn.functional.conv2d with a weight of shape (1, 3, 1, 1)). But a weighted average is not the end of the story either; there is gamma correction etc.
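Here is a sketch of that pointwise convolution, using the ITU-R BT.601 luma weights as one common choice (other standards use slightly different numbers):

```python
import torch
import torch.nn.functional as F

img = torch.rand(4, 3, 64, 64)  # NCHW batch of RGB images

# BT.601 luma weights for R, G, B; shape (out_channels, in_channels, kH, kW)
weight = torch.tensor([0.299, 0.587, 0.114]).view(1, 3, 1, 1)

# 1x1 convolution = per-pixel weighted sum over channels
gray = F.conv2d(img, weight)  # shape (4, 1, 64, 64)
```

This is exactly a weighted sum over channels at every pixel, so it is equivalent to `0.299 * img[:, 0] + 0.587 * img[:, 1] + 0.114 * img[:, 2]`, just expressed as a conv.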
P.S.: @Filos92’s way works on PIL images, not tensors, but uses whatever conversion PIL provides for us (hopefully a good choice). The more PyTorch-y way to get rid of the singleton channel dimension would be img_gray.squeeze(1) rather than .view.
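For completeness, squeezing only dim 1 is safer than an unqualified squeeze(), since it can’t accidentally remove a batch dimension of size 1:

```python
import torch

img_gray = torch.rand(4, 1, 64, 64)  # e.g. the output of the pointwise conv
out = img_gray.squeeze(1)            # drops only the singleton channel dim
print(out.shape)                     # torch.Size([4, 64, 64])
```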