Displaying images loaded with a PyTorch DataLoader thresholds the images

I am working with some lidar data images. When I load them using PyTorch's ImageFolder and DataLoader, with the only transform being a conversion to tensors, there seems to be some extreme thresholding in the displayed images, and I can't locate the cause.

Below is how I’m displaying the first image:

from osgeo import gdal
import matplotlib.pyplot as plt

dataset = gdal.Open(dir)  # dir holds the path to the image (and shadows the builtin)

print(dataset.RasterCount)
img = dataset.GetRasterBand(1).ReadAsArray()

f = plt.figure()
plt.imshow(img)
print(img.shape)
plt.show()

and here is how I am using the data loader and displaying the thresholded image:

Note: this isn't the same image as the one above. The thresholded images do correspond to their original counterparts, so it's not the case that the images are being altered beyond recognition.

import os
import torch
from torchvision import datasets, transforms
import matplotlib.pyplot as plt

data_transforms = {
    'train': transforms.Compose([
        transforms.ToTensor(),
    ]),
    'val': transforms.Compose([
        transforms.ToTensor(),
    ]),
}

image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
                                          data_transforms[x]) for x in ['train', 'val']}
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x],
                                              batch_size=1,
                                              shuffle=True,
                                              num_workers=2) for x in ['train', 'val']}
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'val']}

# each batch is an (images, labels) tuple; images has shape [1, 3, 256, 256]
for image in dataloaders["train"]:
    f = plt.figure()
    print(image[0].shape)
    plt.imshow(image[0].squeeze()[0, :, :])  # drop the batch dim, show the first channel
    plt.show()
    break

Any help, whether an alternative way to display the images or pointing out a mistake I am making, would be greatly appreciated.

I’m not sure I understand this statement completely.
Are all images visualized in an unexpected way or only a subset?

I don't know what the range of the pixel values in the original image was, but note that ToTensor normalizes the values to the range [0, 1].
Could you check if matplotlib is using a wrong format/colormap, and whether the values in the transformed image look as expected?
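
For example, here is a minimal sketch of that scaling behaviour, using a small made-up uint8 array as a stand-in for your image:

import numpy as np
from PIL import Image
from torchvision import transforms

arr = np.array([[0, 128, 255]], dtype=np.uint8)  # made-up 8-bit grayscale values
img = Image.fromarray(arr)

t = transforms.ToTensor()(img)
print(t)  # tensor([[[0.0000, 0.5020, 1.0000]]]) -- each value divided by 255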

I believe it is the ToTensor transform that is disrupting the values. Here are the values of an image before ToTensor is applied:

[[257.10788 257.09265 257.07593 ... 252.98647 252.97443 252.96442]
 [257.09814 257.08295 257.06628 ... 252.99252 252.98094 252.97125]
 [257.0885  257.07336 257.0568  ... 252.99867 252.98766 252.97833]
 ...
 [255.94922 255.93744 255.92479 ... 253.86107 253.8524  253.84364]
 [255.95549 255.94397 255.93147 ... 253.86275 253.85414 253.84538]
 [255.96043 255.94917 255.93686 ... 253.86534 253.85675 253.84795]]

and the values after:

tensor([[[0.9804, 0.9804, 0.9804,  ..., 0.9725, 0.9725, 0.9725],
         [0.9804, 0.9804, 0.9804,  ..., 0.9725, 0.9725, 0.9725],
         [0.9804, 0.9804, 0.9804,  ..., 0.9725, 0.9725, 0.9725],
         ...,
         [0.9725, 0.9725, 0.9725,  ..., 0.9647, 0.9647, 0.9647],
         [0.9725, 0.9725, 0.9725,  ..., 0.9647, 0.9647, 0.9647],
         [0.9725, 0.9725, 0.9725,  ..., 0.9647, 0.9647, 0.9647]],

        [[0.9804, 0.9804, 0.9804,  ..., 0.9725, 0.9725, 0.9725],
         [0.9804, 0.9804, 0.9804,  ..., 0.9725, 0.9725, 0.9725],
         [0.9804, 0.9804, 0.9804,  ..., 0.9725, 0.9725, 0.9725],
         ...,
         [0.9725, 0.9725, 0.9725,  ..., 0.9647, 0.9647, 0.9647],
         [0.9725, 0.9725, 0.9725,  ..., 0.9647, 0.9647, 0.9647],
         [0.9725, 0.9725, 0.9725,  ..., 0.9647, 0.9647, 0.9647]],

        [[0.9804, 0.9804, 0.9804,  ..., 0.9725, 0.9725, 0.9725],
         [0.9804, 0.9804, 0.9804,  ..., 0.9725, 0.9725, 0.9725],
         [0.9804, 0.9804, 0.9804,  ..., 0.9725, 0.9725, 0.9725],
         ...,
         [0.9725, 0.9725, 0.9725,  ..., 0.9647, 0.9647, 0.9647],
         [0.9725, 0.9725, 0.9725,  ..., 0.9647, 0.9647, 0.9647],
         [0.9725, 0.9725, 0.9725,  ..., 0.9647, 0.9647, 0.9647]]])

As you can see, the tensor values come in blocks of the same number, which is what is causing the change, but I'm not sure why this is the case.
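
As a quick sanity check (my own back-of-the-envelope calculation), the distinct values above sit exactly on 8-bit levels, which suggests the data is being quantized to 8 bits somewhere in the loading pipeline:

import numpy as np

# scale the distinct tensor values back up by 255
print(np.round(np.array([0.9804, 0.9725, 0.9647]) * 255))  # [250. 248. 246.]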

And to clarify my note: yes, all the images are visualised this way. My note was just to point out that the first image was not the same one as displayed at the bottom; they were two randomly selected images from the before and after stages. Here is an example of the same image before and after the process:

Before: [original image]

After: [thresholded image]

As you can see, the pattern does match in both images; however, the precise colour information is lost due to the thresholding effect.

I have realised this isn't an issue with ToTensor, but rather with how the images are loaded by ImageFolder. The images I am loading are 256x256, but when loaded with ImageFolder they become 3x256x256. Why is it adding this channel dimension?

By default the pil_loader is used, which converts images to RGB and thus adds the channel dimension.
You could pass another loader to ImageFolder and make sure the images are loaded in the desired format.
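
For reference, the default loader looks roughly like this (the convert('RGB') call is what forces the three channels):

from PIL import Image

def pil_loader(path):
    # roughly what torchvision.datasets.folder.pil_loader does
    with open(path, 'rb') as f:
        img = Image.open(f)
        return img.convert('RGB')  # forces three channels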

What would be an appropriate loader for grayscale TIFF files?

Could you try to load some example images directly via PIL.Image.open and see if the numpy arrays have the expected shape?
If so, you could provide a custom loader function instead of the pil_loader and just remove the convert('RGB').
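
A minimal sketch of such a loader, assuming PIL can read your TIFFs in their native mode (the grayscale_loader name is just illustrative):

import os
from PIL import Image
from torchvision import datasets

def grayscale_loader(path):
    # like torchvision's pil_loader, but without convert('RGB')
    with open(path, 'rb') as f:
        img = Image.open(f)
        img.load()  # force PIL to read the data before the file is closed
        return img

image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
                                          data_transforms[x],
                                          loader=grayscale_loader)
                  for x in ['train', 'val']}

If I remember correctly, ToTensor only rescales 8-bit inputs by 255, so a 32-bit float TIFF loaded this way should keep its original values.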