Hi, I was trying to convert an image tensor to a PIL image and back again with tfs.Compose([tfs.ToPILImage(), tfs.ToTensor()]), using the code below. After the round trip, I found that the image's values had changed. Is this a bug in torchvision.transforms.ToPILImage()? Thank you.
import numpy as np
import torch
from skimage import io, color
from torchvision import transforms as tfs

file = 'path/to/image.png'
# Convert RGB to YCbCr and divide by 255 so the values fall within [0, 1]
img = color.rgb2ycbcr(io.imread(file)) / 255
rows, cols, channel = img.shape
img_y, img_cb, img_cr = np.split(img, indices_or_sections=channel, axis=2)
# Keep only the luma channel as a float tensor of shape (1, rows, cols)
tensor_y = torch.from_numpy(img_y).float().view(1, rows, cols)
# Round trip: float tensor -> PIL image -> float tensor
trans = tfs.Compose([tfs.ToPILImage(), tfs.ToTensor()])
preds = trans(tensor_y)
print((tensor_y.numpy() == preds.numpy()).all())  # prints False
I think you are just losing precision due to quantization.
If you transform your FloatTensor to a PIL.Image, it will be scaled to [0, 255] in uint8. This already quantizes your values, as you cannot map all floating-point values in [0, 1] to a uint8 value. The reverse (ToTensor()) thus yields a FloatTensor with these already quantized values.
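To make the quantization concrete, here is a minimal pure-Python sketch of the round trip described above. The helper name roundtrip_uint8 is made up for illustration, and it assumes the float-to-uint8 step truncates toward zero; the exact rounding inside torchvision may differ, so treat this as a model of the effect rather than the library's code.

```python
def roundtrip_uint8(x: float) -> float:
    """Map a float in [0, 1] to one of 256 integer levels and back.

    Assumption: the cast to uint8 truncates toward zero.
    """
    level = int(x * 255)  # integer level in 0..255
    return level / 255    # back to a float in [0, 1]


values = [0.0, 0.1234, 0.5, 0.75, 1.0]
restored = [roundtrip_uint8(v) for v in values]

# Only values that sit exactly on one of the 256 levels survive unchanged.
exact = [a == b for a, b in zip(values, restored)]
max_error = max(abs(a - b) for a, b in zip(values, restored))

print(exact)                # [True, False, False, False, True]
print(max_error < 1 / 255)  # True
```

Under this assumption the error is always smaller than one level (1/255), so comparing with a tolerance instead of == would pass, but bit-exact equality generally will not.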
Just ran into the same thing. What's a good way to view a tensor as an image without the quantization?
Thanks for your answer. I've looked into the function to_pil_image(pic, mode=None) and found that it explicitly converts a FloatTensor to [0, 255] in uint8. Can we keep the type as float so that there is no precision loss?
Currently, I just convert the tensor into a numpy.ndarray and visualize it through matplotlib, or implement the other image transformations myself on the array to preserve the precision. After that, I convert it back to a tensor.
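A rough sketch of that workaround, assuming the data is already a float32 numpy.ndarray in channel-first (C, H, W) layout. The helper hflip_float and the random test image are stand-ins for illustration and are not from the original post.

```python
import numpy as np


def hflip_float(img: np.ndarray) -> np.ndarray:
    """Horizontally flip a (C, H, W) float array without changing its dtype."""
    return np.ascontiguousarray(img[..., ::-1])


# Stand-in for a real single-channel image with values in [0, 1)
rng = np.random.default_rng(0)
img = rng.random((1, 4, 6), dtype=np.float32)

flipped = hflip_float(img)
restored = hflip_float(flipped)  # flipping twice should give the input back

# No uint8 conversion happens anywhere, so the values stay bit-identical.
print(restored.dtype, np.array_equal(restored, img))  # float32 True
```

For viewing, something like matplotlib's plt.imshow(img[0], cmap='gray', vmin=0, vmax=1) can display the float array directly. The stored data is untouched, and torch.from_numpy can turn the array back into a tensor afterwards.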
My problem is similar. Currently I am using this for augmentation:
augmentations = transforms.Compose([
So should I remove transforms.ToTensor()?
I don’t think the ToTensor transformation is problematic, but rather the transformation to a uint8 image in ToPILImage(), if your use case fits my previous description.
How did you create these floating point images and what do the values represent?