Convert float image array to int in pil via Image.fromarray

I have a PyTorch tensor of shape (3, 256, 256), where 3 is the number of channels and 256 x 256 are the dimensions of the color image. All values are floats.

I am trying to feed this into a network and am using PIL to apply some transforms first. To this end, I do:

img = Image.fromarray((255*imgs[i]).numpy().astype(np.uint8))

but I get:

TypeError: Cannot handle this data type: (1, 1, 256), |u1

When I check the output of (255*imgs[i]).numpy().astype(np.uint8) , I do however see, for example:

[[[ 62  57  59 ...  63  46  36]
  [ 72  71  67 ...  80  76  82]
  [ 58  63  63 ... 145 152 169]
  ...
  [238 240 243 ...   7   7   7]
  [241 239 240 ...   5   5   6]
  [241 243 242 ...   4   3   5]]

 [[ 83  78  80 ...  86  70  61]
  [ 91  90  85 ...  95  93  98]
  [ 80  83  80 ... 141 150 168]
  ...
  [176 178 181 ...  14  14  14]
  [177 176 178 ...  15  15  17]
  [179 180 180 ...  13  13  15]]

 [[147 141 143 ... 150 136 128]
  [147 149 148 ... 154 149 154]
  [141 149 148 ... 178 182 196]
  ...
  [129 131 134 ...  43  43  43]
  [130 130 131 ...  45  45  47]
  [133 134 133 ...  44  44  46]]]

I am not an image expert by a long shot, and I am struggling to troubleshoot this issue now.

Hi,

If you want to convert a tensor into a PIL image, torchvision already has a transform, ToPILImage, that supports the modes defined in PIL and automatically converts to the proper mode. Here is a snippet that simulates your case:

import torch
from torchvision.transforms import ToPILImage  # built-in transform

x = torch.FloatTensor(3, 256, 256).uniform_(0, 1)  # [0, 1] float tensor
img = ToPILImage()(x)  # image corresponding to x

If you print the values of img using print(np.array(img)), you will see that they have already been converted to the [0, 255] range.

Your current approach is logically right, though. The only problem is that Image.fromarray, like most NumPy image code, expects arrays in [height, width, channel] layout, whereas PyTorch uses [channel, height, width]. So you have a layout mismatch in Image.fromarray().

To solve this, the dimensions just need to be permuted:

img = Image.fromarray((255*imgs[i].permute(1, 2, 0)).numpy().astype(np.uint8))
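A small sketch of the difference, assuming a random tensor `chw` as a stand-in for `imgs[i]`. The first call should reproduce the TypeError from the question, and the second should succeed:

```python
import numpy as np
import torch
from PIL import Image

chw = torch.rand(3, 256, 256)                    # stand-in for imgs[i], [C, H, W]
as_uint8 = (255 * chw).numpy().astype(np.uint8)  # right dtype, but still [C, H, W]

try:
    Image.fromarray(as_uint8)                    # channel-first layout is rejected
    failed = False
except TypeError as err:
    failed = True
    print("channel-first array rejected:", err)

# Move the channel axis to the end before handing the array to PIL.
hwc = (255 * chw.permute(1, 2, 0)).numpy().astype(np.uint8)
img = Image.fromarray(hwc)
print(img.mode, img.size)
```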

bests,
Nik

Hi Nik,

FIRSTLY, thank you VERY much for your advice and help.
I do follow what you are saying, but for some reason, I still get the same error.

I added the transforms as you suggested, passed it to imageloader class, and then I do something like this in __getitem__:

def __getitem__(self, index):
    imgs, labels = self.train_data[index], self.train_labels[index]
        
    img_ar = []
    for i in range(len(imgs)):
        print(imgs[i].numpy().shape) #output: (3, 256, 256)
        img = Image.fromarray(imgs[i].permute(1, 2, 0).numpy())
        img = self.train_data_transforms(img)
        ..<do more stuff>
    return img_ar, labels

But I get:

    raise TypeError("Cannot handle this data type: %s, %s" % typekey)

TypeError: Cannot handle this data type: (1, 1, 3), <f4

Shouldn't it be (256, 256, 3)?

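Could you post the stack trace of the error?

And what is the reason behind the Image.fromarray line in your __getitem__? In your first post, you mentioned that imgs are floats in [0, 1], and the <f4 in the error says the array is still float32.

So the proper change would be to scale by 255 and cast to uint8 before calling Image.fromarray, as in the line from my previous reply.

If you are using PyTorch transforms, the best approach is still to use transforms.ToTensor and transforms.ToPILImage in a Compose of transformations.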

Actually Nik, I was doing something stupid. Indeed, your solution works like a charm. Many thanks for your help, REALLY appreciate it (was stuck on this for 1 whole day!). You saved me!

BTW: I did not use the transforms - the problem with this is that I need to convert it to tensor, then PIL and then tensor again(?) since the rest of the code expects a tensor.

I presume doing it manually is still the same?

No problem mate, we have to begin from somewhere but we finally learn!

Yes, in the end, all modules need tensor input. So you are right: if you want to use some of the PIL-based functions, you have to convert to PIL and then back to a tensor. For instance, assume these transforms:

custom_transforms = Compose([
    RandomResizedCrop(size=224, scale=(0.8, 1.2)),
    RandomRotation(degrees=(-30, 30)),
    RandomHorizontalFlip(p=0.5),
    ToTensor(),
    Normalize(mean=mean, std=std),
    RandomNoise(p=0.5, mean=0, std=0.1)])

Note that I had to use ToTensor because I wanted to use the built-in Normalize. The last transformation, RandomNoise, is not part of PyTorch; I implemented it myself. You can define a custom transform like this so that it works with tensors (as in my example) or with PIL images.

import random

import torch


class RandomNoise(object):
    """Add Gaussian noise to a tensor image with probability p."""

    def __init__(self, p, mean=0, std=0.1):
        self.p = p
        self.mean = mean
        self.std = std

    def __call__(self, img):
        # img is expected to be a tensor, so this must come after ToTensor
        if random.random() <= self.p:
            noise = torch.empty(*img.size(), dtype=torch.float, requires_grad=False)
            return img + noise.normal_(self.mean, self.std)
        return img

On top of that, most torchvision transforms are also available as plain functions under torchvision.transforms.functional, much like torch.nn modules have counterparts in torch.nn.functional. So, for instance, if you want the class, use transforms.ToTensor, but if you need it as a function anywhere in your code, just use functional.to_tensor().

That’s why I love PyTorch, really easy to read and use, as you said, like a charm!

If you are new to PyTorch, I highly suggest reading all the tutorials on the official website/GitHub. It won't take much time, and it will help you understand the whole idea easily.

bests

You are awesome, Nik. Thank you for the snippet. The problem I had with the PIL transforms was that I was using them with a dataloader like so:

p = transforms.Compose([
    torchvision.transforms.Resize((512, 512)),
    torchvision.transforms.ToTensor(),
])
trainingds = torchvision.datasets.ImageFolder(root=train_data_path, transform=p)

but when I added the PIL transform to this list, I could still see floats. I did not give much thought as to why it wouldn't work (I use this trainingds to perform tensor operations), and I noticed that the other solution, using the PIL transform in __getitem__, worked, so I went with that instead.

Also, I think keeping the PIL transform in def __getitem__(self, index): is a cleaner solution.

Well, once again, thanks a tonne.