Some of the images in my dataset are grayscale, so I need to convert them to RGB by replicating the grayscale values across the three channels. I am using a transforms.Lambda to do that, based on torch.cat. However, this does not give the expected results.
Example: Let xx be some image of size 28x28, then,
In [67]: xx.shape
Out[67]: torch.Size([28, 28])
In [68]: y = torch.cat([xx, xx, xx], 0)
In [69]: y.shape
Out[69]: torch.Size([84, 28])
I expected the size to be [28, 28, 3]. Any suggestions on how to resolve this?
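For reference, torch.cat joins tensors along an existing dimension, which is why concatenating three [28, 28] tensors along dim 0 yields [84, 28]; note also that PyTorch uses a channels-first layout, so the target shape would be [3, 28, 28] rather than [28, 28, 3]. A minimal sketch of the difference between cat and stack (variable names are illustrative):

import torch

xx = torch.randn(28, 28)                   # a single grayscale image, no channel dim

# cat joins tensors along an existing dim, so the rows are concatenated
print(torch.cat([xx, xx, xx], 0).shape)    # torch.Size([84, 28])

# stack adds a new dim, giving the channels-first layout PyTorch expects
print(torch.stack([xx, xx, xx], 0).shape)  # torch.Size([3, 28, 28])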
While loading your images, you could use Image.open(path).convert('RGB') on all images.
If you are using ImageFolder, this functionality is already built in, since the default loader converts images to RGB.
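As a minimal sketch of that first approach (the dataset path is a placeholder), a custom loader passed to ImageFolder makes the conversion explicit:

from PIL import Image
from torchvision import datasets, transforms

def rgb_loader(path):
    # force every image to 3-channel RGB while loading
    return Image.open(path).convert('RGB')

# 'data/train' is a placeholder path; the default loader already does
# the same conversion, this just makes it explicit
dataset = datasets.ImageFolder('data/train',
                               transform=transforms.ToTensor(),
                               loader=rgb_loader)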
Alternatively, you could repeat the values:
x = torch.randn(28, 28)  # grayscale image without a channel dim
x.unsqueeze_(0)          # add a channel dim in-place -> [1, 28, 28]
x = x.repeat(3, 1, 1)    # replicate along the channel dim
x.shape
> torch.Size([3, 28, 28])
I am using it with MNIST, via the datasets.MNIST loader.
I am not sure, however, where to call the conversion Image.open(path).convert('RGB'), as you noted it should already be there.
The MNIST dataset doesn’t convert the images to RGB, but to grayscale. Have a look at this line of code.
I assume you are using the MNIST data with another color image set?
If so, you could check in __getitem__ whether it’s already a color image and, if not, use my second approach to convert it.
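As a rough sketch of that check (the wrapper class name and the assumption that samples come as (tensor, label) pairs are mine):

import torch
from torch.utils.data import Dataset

class EnsureThreeChannels(Dataset):
    """Hypothetical wrapper: replicates the channel dim of grayscale samples."""
    def __init__(self, base):
        self.base = base

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        img, target = self.base[idx]   # assumes (tensor, label) pairs
        if img.dim() == 2:             # [H, W] -> [1, H, W]
            img = img.unsqueeze(0)
        if img.size(0) == 1:           # grayscale -> replicate channels
            img = img.repeat(3, 1, 1)
        return img, target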
Calling .repeat will actually replicate the image data (taking 3x the memory of the original image), whereas .expand will behave as if the data were replicated without actually doing so. Thus .expand is probably better, unless you want to change the channels independently of each other.
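A short sketch of the difference:

import torch

x = torch.randn(1, 28, 28)

r = x.repeat(3, 1, 1)    # materializes a copy: 3 * 28 * 28 elements stored
e = x.expand(3, 28, 28)  # a view; no data is copied

print(r.shape, e.shape)  # both torch.Size([3, 28, 28])
print(e.stride())        # (0, 28, 1): stride 0 means the channel dim shares data
# writing through the expanded view would alias all three channels,
# so use .repeat (or .clone()) if channels must be modified independently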
import PIL.Image
from torchvision import transforms

class NoneTransform(object):
    ''' Does nothing to the image. To be used instead of None '''
    def __call__(self, image):
        return image

im = PIL.Image.open(img_path)
T = transforms.Compose([
    transforms.ToTensor(),
    transforms.Lambda(lambda x: x.repeat(3, 1, 1)) if im.mode != 'RGB' else NoneTransform()
])
im_torch = T(im)
NB: as far as I remember, passing None instead of NoneTransform() would not work.
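As a variation (mine, not the original poster’s), the check can also be done inside the Lambda on the tensor’s channel dimension, so the pipeline no longer depends on inspecting im at compose time:

from torchvision import transforms

# repeats only when the tensor has a single channel, so the same
# pipeline works for both grayscale and RGB inputs
T = transforms.Compose([
    transforms.ToTensor(),
    transforms.Lambda(lambda x: x.repeat(3, 1, 1) if x.size(0) == 1 else x),
])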
It depends on what you want to do: above, the channels are simply replicated. If you want to colorize grayscale images, you need to use a colorization algorithm.
I don’t know if this is something wrong with Pillow, but I noticed that using Pillow’s convert method loses all information from the loaded int32 grayscale image and sets all values to 255. I can confirm that the entropy of the image was definitely higher before I converted it to “RGB”. Stacking the image by hand works, but causes problems for the image transformations I want to apply.
Yes, you are correct. Any idea how to convert from int32 to uint8 without clipping? My images always get loaded as int32; their mode is set to “I”, which according to the docs means 32-bit integer pixels.
That won’t be possible, as int32 uses 32 bits and has a wider range than uint8 with its 8 bits.
Unless you make sure the original int32 image doesn’t contain values < 0 or > 255, you would clip them.
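One possible workaround, sketched under the assumption that min-max rescaling is acceptable for your data (img_path is a placeholder as above): rescale the full int32 range into [0, 255] before casting, instead of clipping:

import numpy as np
from PIL import Image

img = Image.open(img_path)               # mode 'I': int32 pixels
arr = np.asarray(img, dtype=np.float64)

# min-max rescale to [0, 255] so no values are clipped away
lo, hi = arr.min(), arr.max()
arr = (arr - lo) / max(hi - lo, 1) * 255.0

img8 = Image.fromarray(arr.astype(np.uint8)).convert('RGB')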