Image pixel values converted from [0,255] to float type

Hi guys! I am facing some issues related to the values of my image pixels.
In the code below I created a CustomDataset class that inherits from Dataset. The __getitem__() method converts an image to the CIELab color space and returns two tensors: the L channel and the (a, b) channels.

```
import os

import numpy as np
import torch
from PIL import Image
from skimage.color import rgb2lab
from torch.utils.data import Dataset
from torchvision import transforms


class CustomDataset(Dataset):
    """Custom Dataset."""

    def __init__(self, root_dir, transform=None):
        """
        Args:
            root_dir (string): Directory with all the images.
            transform (callable, optional): Optional transform to be applied
                on a sample.
        """
        self.root_dir = root_dir
        self.transform = transform
        self.file_list = os.listdir(root_dir)

    def __len__(self):
        return len(self.file_list)

    def __getitem__(self, idx):
        img = Image.open(self.root_dir + '/' + self.file_list[idx])
        if self.transform is not None:
          img_original = self.transform(img)
          # 2 == PIL.Image.BILINEAR interpolation
          img_resize = transforms.Resize((128, 128), 2)(img_original)

          img_original = np.asarray(img_original)

          img_lab = rgb2lab(img_resize)

          # Center the L channel around 0 (L lies in [0, 100])
          img_l = img_lab[:, :, 0] - 50.
          img_l = np.asarray(img_l)

          img_ab = img_lab[:, :, 1:3]
          img_ab = np.asarray(img_ab)

          img_l = torch.from_numpy(img_l)
          img_ab = torch.from_numpy(img_ab.transpose((2, 0, 1)))

          return img_l, img_ab
```

So, when I created the image loader and printed the image values from a batch, the values are floats instead of integer values from 0 to 255.
```
scale_transform = transforms.Compose([
    transforms.Resize((224, 224), 2),
    #transforms.ToTensor()
])

#custom_dataset = ImageFolder(root='/content/gdrive/My Drive/Colab Notebooks/Colorful_Image_Colorization/Dataset/Lips_1/', transform=scale_transform)
custom_dataset = CustomDataset(root_dir=dataset_dir, transform=scale_transform)
images_loader = torch.utils.data.DataLoader(dataset=custom_dataset, batch_size=1, shuffle=True)

print(len(custom_dataset))
print(len(images_loader))

for idx, batch in enumerate(images_loader, 1):
    print(idx)
    print(batch)
```

This is the output from printing the batch:
```
[tensor([[[ 35.6241, 31.3336, 30.6141, …, 38.3565, 35.5160, 38.7102],
[ 33.4845, 30.6141, 29.1711, …, 36.5835, 34.0883, 36.9387],
[ 32.4105, 30.2538, 29.1711, …, 35.5160, 36.5835, 37.7532],
…,
[-14.7286, -15.1382, -13.5030, …, -3.1598, -1.9797, -4.7591],
[-11.8763, -13.9110, -13.9110, …, -1.5874, -1.9797, -1.1955],
[ -9.8549, -12.6885, -13.9110, …, -3.9489, -2.7659, 1.1459]]],
dtype=torch.float64), tensor([[[[ 6.5148, 6.6274, 6.6471, …, 3.2241, 3.2588, 3.2199],
[ 6.5699, 6.6471, 6.6875, …, 3.2456, 3.2769, 3.2412],
[ 6.5983, 6.6571, 6.6875, …, 3.2588, 3.2456, 3.7464],
…,
[18.2632, 18.3172, 18.1048, …, 16.4730, 16.3525, 16.5302],
[17.9019, 18.1570, 18.1570, …, 16.3132, 16.3525, 16.2743],
[17.6609, 18.0022, 18.1570, …, 16.5555, 16.4324, 16.0492]],

     [[14.0038, 14.1720, 14.2013,  ...,  9.9716, 10.0447,  9.9627],
      [14.0864, 14.2013, 14.2608,  ..., 10.0169, 10.0824, 10.0078],
      [14.1288, 14.2160, 14.2608,  ..., 10.0447, 10.0169,  9.6244],
      ...,
      [18.0844, 18.1408, 17.9199,  ..., 19.0037, 18.8814, 19.7399],
      [17.7116, 17.9739, 17.9739,  ..., 18.8415, 18.8814, 18.8021],
      [17.4672, 17.8143, 17.9739,  ..., 19.0876, 18.9625, 18.5743]]]],
   dtype=torch.float64)]

```
Could you please help me understand why this is happening?
I can provide the Google Colab Notebook if it helps.

Best regards,

Matheus Santos.

How did you define rgb2lab?
Based on the dtype, I would guess that the numpy array is converted to float64 at some point during loading and processing.
Could you add print statements to __getitem__ and check the dtype after each operation?

Hey!
Sorry for the delay in replying.

I imported from skimage:
from skimage.color import rgb2lab

And then I executed the following code:
img_lab = rgb2lab(img_resize)

I executed the following code with some prints:

```
# Making and Configuring the dataset

class CustomDataset(Dataset):
    """Custom Dataset."""

    def __init__(self, root_dir, transform=None):
        """
        Args:
            root_dir (string): Directory with all the images.
            transform (callable, optional): Optional transform to be applied
                on a sample.
        """
        self.root_dir = root_dir
        self.transform = transform
        self.file_list=os.listdir(root_dir)

    def __len__(self):
        return len(self.file_list)

    def __getitem__(self, idx):
        img = Image.open(self.root_dir+'/'+self.file_list[idx])
        if self.transform is not None:
          img_original = self.transform(img)
          print("img_original = self.transform(img)",type(img_original))

          img_resize = transforms.Resize((128,128),2)(img_original)
          print("img_resize = transforms.Resize((128,128),2)(img_original)",type(img_resize))

          img_original = np.asarray(img_original)
          
          img_lab = rgb2lab(img_resize)
          print("img_lab = rgb2lab(img_resize)",img_lab.dtype)

          img_l = img_lab[:,:,0]-50.
          print("img_l = img_lab[:,:,0]-50.",img_l.dtype)
          img_l = np.asarray(img_l, dtype=np.float32)
          print("img_l = np.asarray(img_l, dtype=np.float32)",img_l.dtype)
          
          
          img_ab = img_lab[:, :, 1:3]
          print("img_ab = img_lab[:, :, 1:3]",img_ab.dtype)
          img_ab = np.asarray(img_ab, dtype=np.float32)
          print("img_ab = np.asarray(img_ab, dtype=np.float32)",img_ab.dtype)
          
          img_l = torch.from_numpy(img_l)
          print("img_l = torch.from_numpy(img_l)",img_l.dtype)
          img_ab = torch.from_numpy(img_ab.transpose((2, 0, 1)))
          print("img_ab = torch.from_numpy(img_ab.transpose((2, 0, 1)))",img_ab.dtype)

          print(img_l.shape)
          print(img_ab.shape)
        
          return img_l, img_ab
```

The dtypes after each operation are shown below:

```
img_original = self.transform(img) <class 'PIL.Image.Image'>
img_resize = transforms.Resize((128,128),2)(img_original) <class 'PIL.Image.Image'>
img_lab = rgb2lab(img_resize) float64
img_l = img_lab[:,:,0]-50. float64
img_l = np.asarray(img_l, dtype=np.float32) float32
img_ab = img_lab[:, :, 1:3] float64
img_ab = np.asarray(img_ab, dtype=np.float32) float32
img_l = torch.from_numpy(img_l) torch.float32
img_ab = torch.from_numpy(img_ab.transpose((2, 0, 1))) torch.float32
```

Best regards,

Matheus Santos

Based on the output, it seems rgb2lab converts the image to a float64 array.
Internally, skimage seems to convert the input to a floating point array first, since the color transformation expects floating point values to work properly.
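
For illustration, here is a minimal sketch (assuming a random uint8 RGB array as input) that reproduces the dtype change:

```
import numpy as np
from skimage.color import rgb2lab

# Random uint8 RGB image with values in [0, 255]
rgb = np.random.randint(0, 256, size=(4, 4, 3), dtype=np.uint8)

lab = rgb2lab(rgb)
print(lab.dtype)                                 # float64, even though the input was uint8
print(lab[:, :, 0].min(), lab[:, :, 0].max())    # the L channel lies roughly in [0, 100]
```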

I see. So this conversion from the image to float64 is OK?
Can I proceed, or should I do something about it?

I think the conversion to float64 is necessary for the transformation.
However, afterwards I would transform it to float32, as you have already done.
I think you could start experimenting with the model now. :slight_smile:
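
As a minimal sketch of that last step (reusing the variable names from your __getitem__ above), the downcast could look like this:

```
# float64 is fine while doing the color conversion itself ...
img_lab = rgb2lab(img_resize)

# ... but downcast to float32 before building the tensors for the model
img_l = torch.from_numpy((img_lab[:, :, 0] - 50.).astype(np.float32))
img_ab = torch.from_numpy(img_lab[:, :, 1:3].astype(np.float32).transpose((2, 0, 1)))
```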

Ok! I will try here.
Thanks for the help! :smiley:

Best regards.

Matheus Santos.