When I pass an image to my model I get this error:

Expected 4-dimensional input for 4-dimensional weight 16 3 3 3, but got 2-dimensional input of size [136, 408] instead

My model is:

class mandal(nn.Module):
    def __init__(self):
        #layer1 input (136,136*3) output (136,136)
        #layer2 input (136,136*16) output (136,136*32)
        #layer3 input (136,136*32) output (640,480*64)
        #layer4 input (136,136*64) output (66,66*64)
        #layer5 input (66,66*64) output (33,33*128)
        #layer6 input (33,33*128) output (17,17*128)
        #layer7 input (17,17*128) output (8,8*192)
        #layer8 input (8,8*192) output (8,8*120)
        #layer9 input (8,8*94) output (8,8*64)
        #layer10 input (8,8*64) output (8,8*32)
        #layer11 input (8,8*32) output (8,8*16)
        #layer12 input (8,8*16) output (1024)
        #layer13 input (1024) output (500)
        #layer14 input (80) output (10)
        #layer15 input (10) output (3)
        #dropout p=0.20
    def forward(self,x):
        x = x.view(x.size(0), -1)
        # x = self.encoder(x)
        # x = x.unsqueeze(1)
        x = F.relu(self.layer1(x))
        x = F.relu(self.layer2(x))
        x = F.relu(self.layer3(x))
        x = self.layer4(x)
        x = F.relu(self.layer5(x))
        x = F.relu(self.layer7(x))
        x = F.relu(self.layer8(x))
        x = F.relu(self.layer9(x))
        x = F.relu(self.layer10(x))
        x = self.dropout(x)
        x = self.layer6(x)
        x = self.falten(x)
        y = x
        x = F.relu(self.layer12(x))
        x = self.dropout(x)
        x = F.relu(self.layer13(x))
        x = self.dropout(x)
        x = F.relu(self.layer14(x))
        x = self.dropout(x)
        x = F.relu(self.layer15(x))
        y = F.relu(self.layer16(y))
        y = self.dropout(x)
        y = F.relu(self.layer17(y))
        y = self.dropout(x)
        y = F.relu(self.layer18(y))
        y = self.dropout(x)
        y = F.relu(self.layer19(y))
        return x,y


Conv2d accepts inputs in the form [batch_size, channels, height, width]. So for an arbitrary input x = torch.randn(1, 3, 136, 136), you are reshaping it into a 2-dimensional tensor, which is not valid for a Conv2d layer, since Conv2d expects 4-dimensional input. You can see this in the error message as well: it says it "got 2-dimensional input of size [136, 408]", where 408 = 136 * 3. If you remove the aforementioned line, the convolutional part of your model will work properly. Also, layer11 is never used, which leads to another size-mismatch error at layer17.
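For reference, here is a quick way to see where the [136, 408] shape in the error comes from, assuming the image tensor is channel-last (136, 136, 3) with no batch dimension:

```python
import torch

# a channel-last image with no batch dimension, as the error suggests
img = torch.randn(136, 136, 3)
flat = img.view(img.size(0), -1)  # view(136, -1)
print(flat.shape)  # torch.Size([136, 408]), i.e. 408 = 136 * 3
```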


Hi Nikronic, I can't quite understand; can you explain it more clearly? You describe it well, but I am not able to follow it. My bad.

No problem.
The issue is that you have defined Conv2d layers, which expect at least a [channels, height, width] input. But before passing your input through those layers in the forward method, you reshape it into a 2-dimensional [batch, features] tensor, which is not valid for Conv2d.

for instance, just run this code:

import torch

x = torch.randn(1, 3, 136, 136)
x = x.view(x.size(0), -1)
print(x.shape)  # torch.Size([1, 55488])

So, that line is changing the shape of tensor to something that Conv2d layers cannot accept.

Now run this code:

import torch
import torch.nn as nn

x = torch.randn(1, 3, 136, 136)
# x = x.view(x.size(0), -1)
conv = nn.Conv2d(3, 16, 3, stride=1, padding=1)
x = conv(x)
print(x.shape)  # torch.Size([1, 16, 136, 136])

which works just fine. But if you uncomment the view line (currently commented out), you get the error you mentioned.
The whole idea is that if you are only using Conv2d layers, with no fully connected layer yet, your input has to stay in the 4-dimensional form I explained above.
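As a sketch of that idea (the layer sizes here are made up for illustration, not taken from your model): keep the tensor 4-dimensional through the Conv2d layers and flatten only right before the first Linear layer:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 136, 136)
conv = nn.Conv2d(3, 16, 3, stride=1, padding=1)
x = conv(x)                # (1, 16, 136, 136): still 4D for conv layers
x = x.view(x.size(0), -1)  # flatten only here, before the fully connected part
fc = nn.Linear(16 * 136 * 136, 10)
out = fc(x)
print(out.shape)  # torch.Size([1, 10])
```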

My image size is (136, 136, 3).
After applying x.view(x.size(0), -1) I get torch.Size([136, 408]).
How do I convert my image to (1, 3, 136, 136)?
If I use torch.rand(1, 3, 136, 136) I get another error: size mismatch, m1: [1 x 2], m2: [580 x 80],
at the line y = F.relu(self.layer17(y)).

How do I remove the error at layer17, and what do you mean by layer11 not being used?

About the image size: you can load images channel-first, which is PyTorch's default, or you can convert your code to accept channel-last tensors.
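For example, a minimal way to turn a channel-last (136, 136, 3) image into the (1, 3, 136, 136) layout, assuming the image is already a tensor:

```python
import torch

img = torch.randn(136, 136, 3)  # channel-last, e.g. converted from a numpy array
img = img.permute(2, 0, 1)      # -> (3, 136, 136), channel-first
img = img.unsqueeze(0)          # -> (1, 3, 136, 136), add batch dimension
print(img.shape)  # torch.Size([1, 3, 136, 136])
```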

The reason I used (1, 3, 136, 136) is that you are probably using batch operations, so you can aggregate, say, 1000 images into one tensor of shape (1000, 3, 136, 136).
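For instance, torch.stack turns a list of channel-first images into such a batch (using 4 images here just to keep the example small):

```python
import torch

images = [torch.randn(3, 136, 136) for _ in range(4)]
batch = torch.stack(images)  # -> (4, 3, 136, 136)
print(batch.shape)
```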

About your error: I mentioned it in my first post. Your network is not properly constructed.

And you have not explained why you are using this line.

That line is not part of my original code; I only added it while trying to overcome the error, after finding it suggested somewhere.
My basic code starts from layer1.
I will recalculate the network dimensions and try again, and if I find any error I will tell you about it.
Thank you for your help, it has cleared up my concepts.