PIL Image to FloatTensor (uint16 to float32)

I have tif images that have a data type of unsigned 16-bit integer (uint16).
PIL can read these images without problems and they have the correct type.
But when my custom dataset reader reads such a tif image and then tries to convert it to a tensor, for the usual normalization and subsequent use in the network, things go wrong.

I read the image, which has values from zero up to the maximum value of uint16.
Then I use the standard transformation:

image_transform = transforms.Compose([
          transforms.ToTensor(),
          transforms.Normalize((value,), (value,))
])

This code results in errors.
The ToTensor function converts PIL images with values in [0, 255] and of certain modes to a tensor with values in [0, 1] (thus a float tensor).
For other image modes it keeps an integer data type and just converts the values.

In this case it picks the data type closest to unsigned int16, which is signed int16.
This results in overflows and incorrect data.
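
Here is a minimal sketch that reproduces the problem described above, assuming a torchvision version that maps mode I;16 PIL images to int16 tensors (the array values are made up):

import numpy as np
import PIL.Image
import torchvision.transforms as transforms

# made-up uint16 data with values above the int16 maximum (32767)
arr = np.array([[0, 1000], [40000, 65535]], dtype=np.uint16)
img = PIL.Image.fromarray(arr)      # PIL mode 'I;16'

t = transforms.ToTensor()(img)
print(t.dtype)                      # int16 on the affected versions, not a float in [0, 1]
print(t)                            # 40000 and 65535 wrap around to negative values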

So the question is: how can this be done in an easy (using torch) and fast way?

The way I do it is to first convert to a NumPy array, then convert to float32, and then to a float tensor, which can be used as normal.

import numpy as np
import PIL.Image
import torch

image_fp = open("filepath", "rb")
image = PIL.Image.open(image_fp)
im_arr = np.array(image)                  # uint16 NumPy array
im_arr32 = im_arr.astype(np.float32)      # cast to float32 to avoid overflow
im_tensor = torch.tensor(im_arr32)        # H x W float32 tensor
im_tensor = im_tensor.unsqueeze(0)        # add a channel dimension -> 1 x H x W

And this results in this ugly lambda:

image_transform = transforms.Compose([
          transforms.Lambda(lambda image: torch.tensor(numpy.array(image).astype(numpy.float32)).unsqueeze(0)),
          transforms.Normalize((value,), (value,))
])
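
For context, this is roughly how such a transform could be plugged into the custom dataset reader mentioned above. This is only a sketch; the class name, folder argument, and file pattern are made up:

import glob
import PIL.Image
from torch.utils.data import Dataset

class TifDataset(Dataset):
    # hypothetical dataset that loads uint16 tif images and applies image_transform
    def __init__(self, folder, transform):
        self.paths = sorted(glob.glob(folder + "/*.tif"))
        self.transform = transform

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        image = PIL.Image.open(self.paths[index])
        return self.transform(image)      # 1 x H x W float32 tensor, normalized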

(All these conversions will impact the loading time with large datasets.)

[edit]
So the difference that can save a lot of time is to use:

transforms.Lambda(lambda image: torch.from_numpy(numpy.array(image).astype(numpy.float32)).unsqueeze(0))

instead of the torch.tensor function


Your approach seems valid.
You could maybe save an unnecessary copy by using torch.from_numpy instead of creating a new tensor.
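
A small sketch of the difference (array values are arbitrary): torch.from_numpy wraps the existing NumPy buffer, while torch.tensor always copies the data.

import numpy as np
import torch

arr = np.zeros((4, 4), dtype=np.float32)

shared = torch.from_numpy(arr)   # reuses the NumPy buffer, no copy
copied = torch.tensor(arr)       # allocates new memory and copies the data

arr[0, 0] = 1.0
print(shared[0, 0].item())       # 1.0 -> the tensor sees the change (shared memory)
print(copied[0, 0].item())       # 0.0 -> independent copy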

Yes, I made a little script with a for loop where I read a tif image and then did the conversion. This was repeated 100 000 times (with nothing else in the loop) and timed on Linux. I saved about 30 seconds by using from_numpy (torch.tensor ~2 min, torch.from_numpy ~1.5 min for 100k repetitions).
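
The benchmark script itself was not posted; a rough sketch of how such a comparison could look (the file name and repeat count are placeholders here):

import time
import numpy as np
import PIL.Image
import torch

def convert(path, use_from_numpy):
    image = PIL.Image.open(path)
    arr = np.array(image).astype(np.float32)
    if use_from_numpy:
        return torch.from_numpy(arr).unsqueeze(0)   # reuses the NumPy buffer
    return torch.tensor(arr).unsqueeze(0)           # makes an extra copy

for use_from_numpy in (False, True):
    start = time.perf_counter()
    for _ in range(100_000):
        convert("image.tif", use_from_numpy)        # placeholder file path
    print(use_from_numpy, time.perf_counter() - start)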