Bugs with torchvision.transforms.ToPILImage()?

Hi, I was trying to convert an image tensor into a PIL Image and then convert it back with the following code. However, I found that the image's values were modified by tfs.Compose([tfs.ToPILImage(), tfs.ToTensor()]). Is this a bug in torchvision.transforms.ToPILImage()? Thank you.

from skimage import io, color
from torchvision import transforms as tfs
import numpy as np
import torch


file = 'path/to/image.png'
img = color.rgb2ycbcr(io.imread(file)) / 255
(rows, cols, channel) = img.shape
img_y, img_cb, img_cr = np.split(img, indices_or_sections=channel, axis=2)
tensor_y = torch.from_numpy(img_y).float().view(1, rows, cols)

trans = tfs.Compose([tfs.ToPILImage(), tfs.ToTensor()])
preds = trans(tensor_y)

print((tensor_y.data.numpy() == preds.data.numpy()).all())  # prints False

I think you are just losing accuracy due to quantization.
If you transform your FloatTensor to a PIL.Image, it will be scaled to [0, 255] and stored as uint8.
This already quantizes your values, since you cannot map every floating-point value in [0, 1] onto a ByteTensor.
The reverse transform (ToTensor()) therefore yields a FloatTensor containing these already-quantized values.
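The quantization step can be illustrated without torchvision at all; a minimal sketch, assuming the uint8 conversion amounts to rounding to the nearest of 256 levels:

```python
# Illustrates the quantization ToPILImage applies to a FloatTensor:
# floats in [0, 1] are rounded to one of 256 uint8 levels, so ToTensor
# afterwards can only recover multiples of 1/255.
def quantize(x):
    """Round a float in [0, 1] the way a uint8 PIL image stores it."""
    return round(x * 255) / 255

original = 0.1234
roundtrip = quantize(original)
print(original == roundtrip)                 # False: 0.1234 is not a multiple of 1/255
print(abs(original - roundtrip) <= 1 / 510)  # True: error is at most half a step
```

So the round trip is lossy by construction, but the error is bounded by half a quantization step (1/510).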


Just ran into the same thing, what’s a good way to view a tensor as an image without the quantization?

Thanks for your answer. I've looked into the function to_pil_image(pic, mode=None) and found that it explicitly converts a FloatTensor to [0, 255] in uint8. I'm wondering whether we can keep the type as float so that there is no precision loss?
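For reference, Pillow itself can hold 32-bit floats in mode 'F', which side-steps the uint8 conversion entirely; a sketch assuming Pillow is available (Image.fromarray maps a 2-D float32 array to mode 'F'):

```python
import numpy as np
from PIL import Image

# A float32 image; Image.fromarray with a 2-D float32 array yields a
# PIL image in mode 'F' (32-bit float), so no uint8 quantization occurs.
arr = np.linspace(0.0, 1.0, 16, dtype=np.float32).reshape(4, 4)
pil_img = Image.fromarray(arr, mode='F')

# Converting back recovers the values exactly.
back = np.asarray(pil_img)
print((arr == back).all())  # True
```

Whether to_pil_image itself takes this path for a FloatTensor depends on the torchvision version, so check the behavior of your installed release before relying on it.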

Currently, I just convert the tensor into a numpy.ndarray and visualize it through matplotlib, or implement the other image transformations myself to preserve precision. After that, I convert it back to a torch.Tensor.
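That workaround can be sketched as follows (assuming only NumPy; plt.imshow accepts float arrays directly, so no uint8 conversion is needed for display):

```python
import numpy as np

# A small float image in [0, 1].
img_y = np.array([0.1234, 0.5, 0.9])

# The ToPILImage()/ToTensor() path effectively does this:
quantized = np.round(img_y * 255).astype(np.uint8) / 255.0

# Staying in NumPy keeps the original float values untouched:
kept = img_y.copy()

print((img_y == quantized).all())  # False: values snapped to multiples of 1/255
print((img_y == kept).all())       # True: no precision lost

# Visualize without quantization (uncomment when matplotlib is available):
# import matplotlib.pyplot as plt
# plt.imshow(img_y.reshape(1, -1), cmap='gray', vmin=0.0, vmax=1.0)
# plt.show()
```

Since the values never pass through uint8, converting back with torch.from_numpy is lossless.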
