Hi, I was trying to convert an image tensor into a PIL image and then back again, using the code below. However, I found that the image’s values were modified by tfs.Compose([tfs.ToPILImage(), tfs.ToTensor()]). Is this a bug in torchvision.transforms.ToPILImage()? Thank you.
from skimage import io, color
from torchvision import transforms as tfs
import numpy as np
import torch
file = 'path/to/image.png'
img = color.rgb2ycbcr(io.imread(file)) / 255
(rows, cols, channel) = img.shape
img_y, img_cb, img_cr = np.split(img, indices_or_sections=channel, axis=2)
tensor_y = torch.from_numpy(img_y).float().view(1, rows, cols)
trans = tfs.Compose([tfs.ToPILImage(), tfs.ToTensor()])
preds = trans(tensor_y)
print((tensor_y.numpy() == preds.numpy()).all())  # prints False
I think you are just losing precision due to quantization.
If you transform your FloatTensor to a PIL.Image, its values are scaled to [0, 255] and stored as uint8.
This already quantizes your values, since a ByteTensor has only 256 distinct levels and cannot represent every floating point value in [0, 1].
The reverse (ToTensor()) thus yields a FloatTensor with these already quantized values.
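To make the quantization concrete, here is a minimal NumPy-only sketch that mimics the round trip described above. It assumes the conversion is a plain multiply by 255 followed by a cast to uint8 (which drops the fractional part), and that the reverse step divides by 255. The helper name simulate_uint8_round_trip is hypothetical and is not part of torchvision.

```python
import numpy as np

def simulate_uint8_round_trip(x):
    """Mimic float -> uint8 -> float: scale to [0, 255], cast, scale back.

    Assumption: the cast simply drops the fractional part.
    """
    quantized = (x * 255).astype(np.uint8)  # only 256 levels survive this step
    return quantized.astype(np.float32) / 255

rng = np.random.default_rng(0)
x = rng.random((1, 4, 4), dtype=np.float32)  # float values in [0, 1)
y = simulate_uint8_round_trip(x)

exact_match = bool((x == y).all())       # same check as in the question
max_error = float(np.abs(x - y).max())   # how far the values moved
print(exact_match, max_error)
```

Under these assumptions the values differ after the round trip, but the error per pixel stays below one quantization step (1/255), which is consistent with a precision loss rather than a bug.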
Thanks for your answer. I’ve looked into the function to_pil_image(pic, mode=None) and found that it explicitly scales a FloatTensor to [0, 255] and casts it to uint8. I’m just wondering whether we can keep the type as float so that there is no precision loss.
Currently, I just convert the tensor into a numpy.ndarray and visualize it through matplotlib, or implement some other image transformations myself to preserve the precision. After that, I convert it back to a torch.Tensor.
I don’t think the ToTensor transformation is the problem; rather, it is the conversion to a uint8 image in ToPILImage(), if your use case fits my previous description.
How did you create these floating point images and what do the values represent?