Size mismatch error while testing RGBA images with resnet50

If you want to combine both Datasets, you could use torch.utils.data.ConcatDataset.
However, if you would like to create a separate DataLoader for your training and validation Datasets (which is the usual use case), you would need to create separate transformations, separate Datasets and finally create separate DataLoaders. The linked tutorial explains this pretty clearly.
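A minimal sketch of both options (assuming a hypothetical data_dir with 'train' and 'val' subfolders and placeholder transforms; adjust to your setup):

import os
import torch
from torchvision import datasets, transforms

data_dir = 'data'  # assumed folder layout: data/train/..., data/val/...

# placeholder transforms -- use your own augmentation for training
train_transform = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
val_transform = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

# Option 1 (the usual use case): separate Datasets and DataLoaders
train_dataset = datasets.ImageFolder(os.path.join(data_dir, 'train'), train_transform)
val_dataset = datasets.ImageFolder(os.path.join(data_dir, 'val'), val_transform)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True)
val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=64, shuffle=False)

# Option 2: combine both Datasets into a single one
combined_dataset = torch.utils.data.ConcatDataset([train_dataset, val_dataset])
combined_loader = torch.utils.data.DataLoader(combined_dataset, batch_size=64, shuffle=True)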

I have modified ImageFolder by removing the line return img.convert('RGB') so that all 4 channels are kept.
Now how can I see the size of the images after calling ImageFolder?

import os
import torch
from torchvision import datasets

# data_dir and data_transforms are defined earlier, as in the transfer learning tutorial
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
                                          data_transforms[x])
                  for x in ['train', 'val']}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=64,
                                              shuffle=True, num_workers=2)
               for x in ['train', 'val']}

You could get the first sample and print the shape using:

x, y = image_datasets['train'][0]
print(x.shape)

After modifying imagefolder.py, I am still getting the image size as 3x224x224. It is not showing the fourth channel even though I removed return img.convert('RGB') in def pil_loader(path):

Could you upload a sample image so that I could have a look?
Using this code snippet, I get 4 channels for my input image:

import numpy as np
import torch
from PIL import Image

img = Image.open('./alpha.png')
x = torch.from_numpy(np.array(img))
print(x.shape)
> torch.Size([300, 300, 4])

I ran the code and it gives torch.Size([540, 960, 4]).

In that case, it looks like it’s working. 🙂
You would have to permute the output in your Dataset to return an image tensor of shape [channels, height, width] and it should be fine.
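For example, a minimal sketch of that permute step (using the alpha.png tensor from the snippet above; the .float() cast is an assumption for feeding the tensor to a model):

import numpy as np
import torch
from PIL import Image

img = Image.open('./alpha.png')          # RGBA image, loaded without .convert('RGB')
x = torch.from_numpy(np.array(img))      # shape: [height, width, channels]
x = x.permute(2, 0, 1).float()           # shape: [channels, height, width]
print(x.shape)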

This is my image…

Is this the input image as shown by some image viewer on your system, or did you visualize it in your script?
In the latter case, could you post the code you used to visualize this image?

I don’t have the code with me, but I know it was created in Blender.

I finally confirmed that it is a single-channel image:

import sys
import numpy as np
from PIL import Image

img = Image.open('data_views/val/0/1.ply_whiteshaded_v0.png')
# pixels = list(img.getdata())
pixels = img.load()
print(pixels)

width, height = img.size
data = np.asarray(img)
# print the full array instead of truncating it
np.set_printoptions(threshold=sys.maxsize)

print("..........................channel 0................")
print(data[:, 0])
print("..........................channel 1................")
print(data[:, 1])
print("..........................channel 2................")
print(data[:, 2])
print("..........................channel 3................")
print(data[:, 3])

I am getting output as follows:
..........................channel 0................
[[ 64  64  64 255]
 [ 64  64  64 255]
 [ 64  64  64 255]
 [ 64  64  64 255]
 [ 64  64  64 255]
 [ 64  64  64 255]
 [ 64  64  64 255]
 [ 64  64  64 255]
 [ 64  64  64 255]
 [ 64  64  64 255]
 [ 64  64  64 255]
 [ 64  64  64 255]
 [ 64  64  64 255]
 [ 64  64  64 255]
 [ 64  64  64 255]
 [ 64  64  64 255]
 [ 64  64  64 255]
 ...
I hope I am correct.
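A quick cross-check sketch, assuming the same file path: img.mode and img.getbands() report the number of bands directly, and a single colour channel of an H x W x C array is data[:, :, c], while data[:, c] selects the c-th column of pixels.

import numpy as np
from PIL import Image

img = Image.open('data_views/val/0/1.ply_whiteshaded_v0.png')
print(img.mode)        # e.g. 'RGBA' for a 4-band image, 'L' for a single-channel image
print(img.getbands())  # tuple of band names, e.g. ('R', 'G', 'B', 'A')

data = np.asarray(img)
print(data.shape)          # (height, width, channels) for a multi-band image
if data.ndim == 3:
    print(data[:, :, 0])   # channel 0 of every pixel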

@ptrblck Your comments and solution helped me! Thank you!