Error: pic should be 2/3 dimensional. Got 1 dimensions

Hi All,

I am trying to solve a binary image classification problem by implementing a feed-forward neural network.

I am getting an error like this:

File "/opt/conda/lib/python3.7/site-packages/torchvision/transforms/functional.py", line 104, in to_pil_image
raise ValueError('pic should be 2/3 dimensional. Got {} dimensions.'.format(pic.ndimension()))
ValueError: pic should be 2/3 dimensional. Got 1 dimensions.

Here is my code.

Please let me know where the problem is. CODE HERE

It would be great if someone could point out where the problem is in my code.

I have just joined this forum; pardon me if this is not the right place to ask these questions. (I'm a beginner with PyTorch.)

Regards
Shravan

In your datasets you are flattening the tensors via:

data = data.view(1, -1)

which seems wrong, as the expected shape of a color image tensor would be [3, height, width].
Could you check the shape of X_train etc. before passing them to the dataset and make sure the image shape is correct?
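
For example, a quick check might look like this (a minimal sketch; the random tensor below is only a stand-in for the actual X_train):

import torch

# stand-in for X_train: N images stored as [height, width, channels]
X_train = torch.randint(0, 256, (4, 224, 224, 3), dtype=torch.uint8)

print(X_train.shape)        # torch.Size([4, 224, 224, 3])
print(X_train[0].shape)     # torch.Size([224, 224, 3]) -> HWC, not the CHW layout PyTorch layers expect

# one way to get a single image into [3, height, width]
img_chw = X_train[0].permute(2, 0, 1)
print(img_chw.shape)        # torch.Size([3, 224, 224])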

Thanks very much for the reply.

The input size of each image is (224, 224, 3).

I tried changing the code to: data = data.view(-1, 224, 224, 3)

I get the following error.

RuntimeError: size mismatch, m1: [32 x 2016], m2: [150528 x 1536] at /opt/conda/conda-bld/pytorch_1587428398394/work/aten/src/THC/generic/THCTensorMathBlas.cu:283

Initially it was giving me this error. Hence, to avoid it, I changed the line to data = data.view(1, -1), but then it gives the other error.

Do you think we can resolve this?

Thanks & Regards
Shravan

Apply the transform first and then flatten the result using view.
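
For example, something along these lines (a rough sketch; img is a random CHW tensor standing in for a single image):

import torch
from torchvision import transforms

transform = transforms.Compose([transforms.ToPILImage(), transforms.ToTensor()])

img = torch.rand(3, 224, 224)     # one image in [channels, height, width] layout
out = transform(img)              # shape is still [3, 224, 224] after ToTensor
flat = out.view(1, -1)            # [1, 150528] -- flatten only after the transform
print(flat.shape)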

I already did this, like:

transform = transforms.Compose(
    [transforms.ToPILImage(), transforms.ToTensor()])

and did this:

class DatasetProcessing(Dataset):

    # initialise the class variables - transform, data, target
    def __init__(self, data, target, transform=None):
        self.transform = transform
        data = data.view(1, -1)

        self.data = data
        # converting target to torch.LongTensor dtype
        self.target = target

    # retrieve the X and y index value and return it
    def __getitem__(self, index):
        return (self.transform(self.data[index]), self.target[index])

    # returns the length of the data
    def __len__(self):
        return len(list(self.data))

Try this. Please note that, in your code, you are applying the transform after view.

#initialise the class variables - transform, data, target
def __init__(self, data, target, transform=None): 
    data = transform(data)
    self.data =  data.view(1, -1)
    # converting target to torch.LongTensor dtype
    self.target = target 

#retrieve the X and y index value and return it
def __getitem__(self, index):   
    return (self.data[index], self.target[index])

#returns the length of the data
def __len__(self): 
    return len(list(self.data))

transform is only instantiated and is not applied until you explicitly call it on an image, which you are doing in the __getitem__ method. But note that you are flattening your data/image in the __init__ method prior to applying your transform. I hope this clarifies my earlier comment about transform after view.
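
In other words (a tiny sketch with a stand-in image, just to illustrate the ordering):

import torch
from torchvision import transforms

transform = transforms.Compose([transforms.ToPILImage(), transforms.ToTensor()])  # nothing runs here

img = torch.rand(3, 224, 224)         # stand-in CHW image

# flattening first breaks ToPILImage, because it receives a 1-D tensor:
# transform(img.view(1, -1)[0])       # ValueError: pic should be 2/3 dimensional. Got 1 dimensions.

# transforming first, then flattening, works:
flat = transform(img).view(1, -1)     # torch.Size([1, 150528])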

Thanks for the reply and the clear explanation.
I just got this error again:
"ValueError: pic should be 2/3 dimensional. Got 4 dimensions."

I believe I am missing a small trick to convert the image shape to the right format. Any ideas?

From what I understand, torchvision.transforms only works on one image at a time, not on a batch of images. I guess this should do (see also the usage sketch after these methods):

#initialise the class variables - transform, data, target
def __init__(self, data, target, transform=None): 
    self.transform = transform     
    self.data =  data
    # converting target to torch.LongTensor dtype
    self.target = target 

#retrieve the X and y index value and return it
def __getitem__(self, index):   
    return (self.transform(self.data[index]).view(1, -1), self.target[index])

#returns the length of the data
def __len__(self): 
    return len(list(self.data))
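
For reference, a usage sketch of this per-sample approach (assumptions: the images below are random CHW uint8 stand-ins, the class name is hypothetical, and view(-1) is used so the DataLoader collates batches to [batch_size, 150528]; the actual data is HWC, so each real image may first need a permute(2, 0, 1)):

import torch
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms

transform = transforms.Compose([transforms.ToPILImage(), transforms.ToTensor()])

# stand-in data: 8 CHW uint8 images and dummy labels
images = torch.randint(0, 256, (8, 3, 224, 224), dtype=torch.uint8)
labels = torch.zeros(8)

class FlattenedImages(Dataset):
    def __init__(self, data, target, transform=None):
        self.data, self.target, self.transform = data, target, transform

    def __getitem__(self, index):
        # transform one image at a time, then flatten it
        return self.transform(self.data[index]).view(-1), self.target[index]

    def __len__(self):
        return len(self.data)

loader = DataLoader(FlattenedImages(images, labels, transform), batch_size=4)
x, y = next(iter(loader))
print(x.shape)   # torch.Size([4, 150528]) -> what a Linear(150528, ...) layer expects
print(y.shape)   # torch.Size([4])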

No, it is not working with the above code.

RuntimeError: size mismatch, m1: [32 x 2016], m2: [150528 x 1536] at /opt/conda/conda-bld/pytorch_1587428190859/work/aten/src/TH/generic/THTensorMath.cpp:41

@harsha_g

I changed the class like this to pass each image into the transform function.

class DatasetProcessing(Dataset):

    # initialise the class variables - transform, data, target
    def __init__(self, data, target, transform=None):
        # self.transform = transform
        print("Before transformation")
        print(data.size())
        print(data[0].size())

        outputs = []
        datalen = data.size()[0]
        for i in range(datalen):
            tensor = transform(data[i, :, :, :])   # transform
            outputs.append(tensor)
        result = torch.cat(outputs, dim=1)   # shape (64, 32*in_channels, 224, 224)

        print("After transformation")
        print(result.size())
        data = result
        self.data = data.view(1, -1)
        # converting target to torch.LongTensor dtype
        self.target = target

    # retrieve the X and y index value and return it
    def __getitem__(self, index):
        return (self.data[index], self.target[index])

    # returns the length of the data
    def __len__(self):
        return len(list(self.data))
Output from the prints:

Before transformation
torch.Size([1316, 224, 224, 3])
torch.Size([224, 224, 3])
After transformation
torch.Size([3, 294784, 3])

Error message:

RuntimeError: size mismatch, m1: [3 x 221760], m2: [150528 x 1536] at /opt/conda/conda-bld/pytorch_1587428190859/work/aten/src/TH/generic/THTensorMath.cpp:41

Output from validation (the output should be showing a batch of 32 images with 32 labels, but I am not sure why it is producing the output below):
IMAGES
torch.Size([3, 73920, 3])
LABEL
tensor([0., 0., 0.])
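
For reference, the printed shape [3, 294784, 3] is what torch.cat along dim=1 produces here (294784 = 1316 x 224), which suggests each transformed image came out as [3, 224, 3] rather than [3, 224, 224]. A minimal sketch with stand-in tensors shows the difference between cat and stack:

import torch

# stand-ins for the per-image outputs collected in the loop above
outputs = [torch.rand(3, 224, 3) for _ in range(1316)]

print(torch.cat(outputs, dim=1).shape)    # torch.Size([3, 294784, 3]) -- images merged along one axis
print(torch.stack(outputs, dim=0).shape)  # torch.Size([1316, 3, 224, 3]) -- keeps a separate image dimension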

Hi, it was really difficult to figure out the issue without the full stack trace. However, I just took a look at your code and it is failing in the evaluate function all the way at the bottom. The size of your labels looks fine after passing through the DataLoader. It is failing at self(images). You might want to revisit your EmergencyVehicleDetectionModel class and ensure the sizes of the inputs and targets match as intended.

Thanks for the reply. So do you think the code I wrote earlier was good (as you said, my GitHub code looks fine and the only place I need to look at is evaluate)?
What about the modified code I wrote above? Which code should I stick with and debug: the modified code above or my GitHub code?

@shravan I took another pass at your code and figured out some errors, but trying to correct one led to another and so on, and I couldn't take it further in the interest of time (sorry about that). Still, I'll try to help you get started so you know what exactly the issue is.

One, the input image, which is 224x224x3, is transformed using ToPILImage and further flattened using view. Try to understand the shape of the output (3 x 224 x 3 = 2016). Later on, when you define your EmergencyVehicleDetectionModel class, you hard-code an input of size 224 x 224 x 3 = 150528. I hope you can see what's happening here: you are feeding a tensor of shape 2016 to a network expecting a tensor of shape 150528.

I hope this helps you understand where to look in your code and how to go about fixing it. The other follow-up issues are not really difficult to address, and some intuitive understanding of how various NN layers and loss functions work, and what shapes they expect, will come in very handy. I urge you to spend more time going through the PyTorch documentation for every function you use. I wish you the best.
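
To make the mismatch concrete, here is a small sketch; the layer below is only a guess based on the [150528 x 1536] weight shape in the error message, not the actual model:

import torch
import torch.nn as nn

# hypothetical first layer with in_features = 224*224*3 = 150528 (about 231M parameters, for illustration only)
fc1 = nn.Linear(224 * 224 * 3, 1536)

good = torch.rand(32, 224 * 224 * 3)    # a batch correctly flattened to [32, 150528]
print(fc1(good).shape)                  # torch.Size([32, 1536])

bad = torch.rand(32, 2016)              # what flattening a 3x224x3 tensor per image gives
# fc1(bad)                              # would raise: size mismatch, m1: [32 x 2016], m2: [150528 x 1536]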

Thanks, this is really great. It is so kind of you to provide all the follow-up answers.