Not displaying loaded images

I am trying to load images from a folder using a DataLoader, but it's giving me the below error. I'm not sure why it's happening… I am a complete novice in PyTorch, so please suggest a potential solution for this.

Folder structure:
./data/Disguised/Set/
./data/Original/Set/
All images are present under this hierarchy in jpg format.

OSError: Traceback (most recent call last):
  File "/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
  File "/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
  File "/anaconda3/lib/python3.7/site-packages/torchvision/datasets/folder.py", line 132, in __getitem__
    sample = self.loader(path)
  File "/anaconda3/lib/python3.7/site-packages/torchvision/datasets/folder.py", line 178, in default_loader
    return pil_loader(path)
  File "/anaconda3/lib/python3.7/site-packages/torchvision/datasets/folder.py", line 160, in pil_loader
    img = Image.open(f)
  File "/anaconda3/lib/python3.7/site-packages/PIL/Image.py", line 2687, in open
OSError: cannot identify image file <_io.BufferedReader name='./data/Disguised/set/Marisa_Tomei_h_003.jpg'>

This error seems to be related to PIL.
Is this particular image still opened in another process, or did you just write to the file without a flush?
Also, did you (accidentally) import Image instead of from PIL import Image?
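An "OSError: cannot identify image file" usually means PIL could read the file but could not recognize it as an image, e.g. because it is truncated or is not actually a JPEG. As a quick sanity check, you could scan the dataset folder for files that lack the JPEG start/end markers. This is just a rough sketch; `looks_like_jpeg` is a hypothetical helper, not a PIL or torchvision API:

```python
from pathlib import Path

def looks_like_jpeg(path):
    """Cheap sanity check: JPEG files start with FF D8 and end with FF D9."""
    data = Path(path).read_bytes()
    return len(data) > 4 and data[:2] == b"\xff\xd8" and data[-2:] == b"\xff\xd9"

# Scan the dataset root for files PIL is likely to reject
# (adjust data_root to your folder layout):
data_root = Path("./data")
if data_root.exists():
    for p in data_root.rglob("*.jpg"):
        if not looks_like_jpeg(p):
            print("suspicious file:", p)
```

A file flagged here would also fail in `PIL.Image.open`, which is what the DataLoader worker calls internally via `pil_loader`.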

I am not using PIL to import images in my code.
No, I haven't opened the image in any other process; I am just using the DataLoader.

Below is the code…

import torch
from torchvision import datasets, transforms

transform = transforms.Compose([
        transforms.Resize(32),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

dataset = datasets.ImageFolder(root="./data/", transform=transform)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True, num_workers=2)


import matplotlib.pyplot as plt
import numpy as np

# function to show an image
def imshow(img):
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))

    
# get some images
dataiter = iter(dataloader)
images = dataiter.next()

I'm now getting the below error after resolving the previous issue.

  File "/Users/surekhagaikwad/Documents/COMP8420_Ass2/cnn_draft.py", line 37, in <module>
    images = dataiter.next()

  File "/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 582, in __next__
    return self._process_next_batch(batch)

  File "/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)

RuntimeError: Traceback (most recent call last):
  File "/anaconda3/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/anaconda3/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 68, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/anaconda3/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 68, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/anaconda3/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 43, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 48 and 42 in dimension 2 at /Users/distiller/project/conda/conda-bld/pytorch_1556653492823/work/aten/src/TH/generic/THTensor.cpp:711

It looks like the Resize is not being applied.
Could you set batch_size=1 and print the shape of your data?

Yes, I did that and it worked, but I could not understand why.

Because the size mismatch error is not thrown anymore if you are not trying to stack the tensors.
That is why the shape information would be interesting to see. :wink:
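To make the mechanism concrete: `default_collate` ends in a call to `torch.stack`, which requires every sample tensor in the batch to have exactly the same shape. Here is a toy pure-Python sketch of that constraint (`can_collate` is a hypothetical illustration, not a PyTorch function):

```python
def can_collate(shapes):
    # default_collate ends in torch.stack(batch, 0), which requires every
    # sample tensor in the batch to have exactly the same shape.
    return len(set(shapes)) <= 1

# With batch_size=1 each "batch" holds a single sample, so stacking
# trivially succeeds even when the images differ in size:
can_collate([(1, 48, 32)])                # True
# With batch_size >= 2 the mismatched heights (48 vs. 42) collide:
can_collate([(1, 48, 32), (1, 42, 32)])   # False -> the RuntimeError above
```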

But I still don't understand the logic behind it.

It’s just a debugging step to see the shapes of your data.
The error seems to be thrown since the spatial sizes of your image tensors differ, although you are using Resize in your transformation.
If you run the code with batch_size=1, you could print the shape using:

for data, target in dataloader:
    print(data.shape)

so that we can debug this issue further.

hmm thanks @ptrblck :slight_smile:

We would still need the output shapes! :wink:
So could you please post them here?

Setting the batch_size=1 is just a way to debug your code and is not the solution to this issue.

This is what I am getting:

torch.Size([1, 1, 48, 32])
torch.Size([1, 1, 42, 32])

Thanks for the output.
Try to use Resize((32, 32)) and run your code again with the original batch size.
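The difference is that `Resize` with a single int scales only the *shorter* side to that value and keeps the aspect ratio, so images with different aspect ratios end up with different heights, while `Resize((32, 32))` forces an exact output size. A small pure-Python sketch of the shorter-side rule (`shorter_side_resize` is a hypothetical helper that mimics the behavior, not a torchvision API):

```python
def shorter_side_resize(h, w, size):
    # torchvision's Resize with a single int scales the *shorter* side to
    # `size` and the other side proportionally, preserving aspect ratio.
    if h <= w:
        return size, int(size * w / h)
    return int(size * h / w), size

# Two images with the same width but different heights, matching the
# shapes seen above:
shorter_side_resize(480, 320, 32)  # (48, 32)
shorter_side_resize(420, 320, 32)  # (42, 32)
# Resize((32, 32)) instead maps every image to exactly 32x32.
```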

That's kind of weird… now it's showing me two images instead of one.

Could you explain it a bit?
Which shape does your data batch have now?

torch.Size([2, 1, 32, 32])

This is how I am setting batch_size and Resize…

transform = transforms.Compose([transforms.Grayscale(num_output_channels=1),
        transforms.Resize((32,32)),
        transforms.ToTensor(),
        transforms.Normalize([0.5], [0.5])])
 
dataset = datasets.ImageFolder(root="./data/", transform=transform)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True, num_workers=1)

How many images are stored in the subfolders in ./data/?

As of now I have kept only two images for testing purposes.
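That explains the shape: the DataLoader yields at most `batch_size` samples per batch, so with only two images and `batch_size=32` the single batch contains both images, giving `torch.Size([2, 1, 32, 32])`. A rough pure-Python sketch of the chunking (`batch_sizes` is a hypothetical illustration, not a PyTorch function):

```python
def batch_sizes(n_samples, batch_size, drop_last=False):
    # DataLoader chops the dataset into consecutive chunks of batch_size;
    # the final chunk keeps the remainder unless drop_last=True.
    full, rem = divmod(n_samples, batch_size)
    sizes = [batch_size] * full
    if rem and not drop_last:
        sizes.append(rem)
    return sizes

batch_sizes(2, 32)   # [2]  -> one batch containing both images
batch_sizes(70, 32)  # [32, 32, 6]
```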