Runtime Error: Given groups=1, weight of size 32 32 3 3, expected input[32, 3, 178, 178] to have 32 channels, but got 3 channels instead

I’m new to PyTorch and trying to implement the BKNet network from the article “Facial smile detection using convolutional neural networks”.

I wrote the following neural network code:

model = nn.Sequential(nn.Conv2d(32, 32, kernel_size=(3,3), stride=1),
                      nn.Conv2d(32, 32, kernel_size=(3,3), stride=1),
                      nn.MaxPool2d(kernel_size=(2,2), stride=2),
                      nn.ReLU(),
                      nn.Conv2d(64, 64, kernel_size=(3,3), stride=1),
                      nn.Conv2d(64, 64, kernel_size=(3,3), stride=1),
                      nn.MaxPool2d(kernel_size=(2,2), stride=2),
                      nn.ReLU(),
                      nn.Conv2d(128, 128, kernel_size=(3,3), stride=1),
                      nn.Conv2d(128, 128, kernel_size=(3,3), stride=1),
                      nn.Conv2d(128, 128, kernel_size=(3,3), stride=1),
                      nn.MaxPool2d(kernel_size=(2,2), stride=2),
                      nn.ReLU(),
                      nn.Conv2d(256, 256, kernel_size=(3,3), stride=1),
                      nn.Conv2d(256, 256, kernel_size=(3,3), stride=1),
                      nn.Conv2d(256, 256, kernel_size=(3,3), stride=1),
                      nn.MaxPool2d(kernel_size=(2,2), stride=2),
                      nn.ReLU(),
                      nn.Linear(256,256),
                      nn.Linear(256,256),
                      nn.ReLU(),
                      nn.Linear(256,2),
                      nn.LogSoftmax(dim=1))

I trained it using the training code here on an image database with many images of size 178x218, which are transformed in the following way:

data_transforms = {
    'train': transforms.Compose([
        transforms.Resize(178),
        transforms.CenterCrop(178),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
    ]),
    'val': transforms.Compose([
        transforms.Resize(178),
        transforms.CenterCrop(178),
        transforms.ToTensor(),
        transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
    ]),
}

But I’m getting the following error:

RuntimeError: Given groups=1, weight of size 32 32 3 3, expected input[32, 3, 178, 178] to have 32 channels, but got 3 channels instead

What have I done wrong?

I think I fixed it by changing the number of input channels of the first convolutional layer of each block to match the number of output channels of the last convolutional layer of the previous block.

Now getting this error:

RuntimeError: size mismatch, m1: [163840 x 5], m2: [256 x 256] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:197

Could someone provide some help?

This error is most likely thrown because you are not including a flatten operation after your last pooling layer. Add nn.Flatten right before your first nn.Linear and run the code again.
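
For illustration, nn.Flatten keeps the batch dimension and collapses all remaining dimensions into one (the shape below is just an example, not taken from your model):

import torch
import torch.nn as nn

x = torch.randn(32, 256, 5, 5)   # e.g. an activation of shape [N, C, H, W]
flat = nn.Flatten()(x)           # keeps dim 0 (batch), flattens the rest
print(flat.shape)                # torch.Size([32, 6400]), i.e. [N, C * H * W]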

I tried it but I’m still getting the same error :frowning:

  1. As @ptrblck said, one of your errors is that you should add nn.Flatten before the first nn.Linear.
  2. Also, change the first line
    model = nn.Sequential(nn.Conv2d(32, 32, kernel_size=(3, 3), stride=1),
    to
    model = nn.Sequential(nn.Conv2d(3, 32, kernel_size=(3, 3), stride=1),
    as your input is a tensor with 3 channels (see the small check after this list).
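
For reference, a small check with a random batch shaped like the one in your error message, showing that the first layer has to accept 3 input channels:

import torch
import torch.nn as nn

x = torch.randn(32, 3, 178, 178)                         # images after ToTensor(): 3 RGB channels
out = nn.Conv2d(3, 32, kernel_size=(3, 3), stride=1)(x)
print(out.shape)                                         # torch.Size([32, 32, 176, 176])
# nn.Conv2d(32, 32, ...)(x) would raise the "expected input ... to have 32 channels" error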

I changed it to this:

model = nn.Sequential(nn.Conv2d(3, 32, kernel_size=(3,3), stride=1),
                      nn.Conv2d(32, 32, kernel_size=(3,3), stride=1),
                      nn.MaxPool2d(kernel_size=(2,2), stride=2),
                      nn.ReLU(),
                      nn.Conv2d(32, 64, kernel_size=(3,3), stride=1),
                      nn.Conv2d(64, 64, kernel_size=(3,3), stride=1),
                      nn.MaxPool2d(kernel_size=(2,2), stride=2),
                      nn.ReLU(),
                      nn.Conv2d(64, 128, kernel_size=(3,3), stride=1),
                      nn.Conv2d(128, 128, kernel_size=(3,3), stride=1),
                      nn.Conv2d(128, 128, kernel_size=(3,3), stride=1),
                      nn.MaxPool2d(kernel_size=(2,2), stride=2),
                      nn.ReLU(),
                      nn.Conv2d(128, 256, kernel_size=(3,3), stride=1),
                      nn.Conv2d(256, 256, kernel_size=(3,3), stride=1),
                      nn.Conv2d(256, 256, kernel_size=(3,3), stride=1),
                      nn.MaxPool2d(kernel_size=(2,2), stride=2),
                      nn.ReLU(),
                      nn.Flatten(),
                      nn.Linear(256,256),
                      nn.Linear(256,256),
                      nn.ReLU(),
                      nn.Linear(256,2),
                      nn.LogSoftmax(dim=1))

But still getting this error:

RuntimeError: size mismatch, m1: [32 x 25600], m2: [256 x 256] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:290

The in_features of the first nn.Linear has to match the flattened activation size, i.e. C * H * W after the last pooling layer:

nn.MaxPool2d(kernel_size=(2, 2), stride=2),  # output shape: [N, C, H, W]
nn.ReLU(),  # output shape: [N, C, H, W]
nn.Flatten(),  # output shape: [N, C * H * W]
nn.Linear(256 * 10 * 10, 256),  # set in_features to C * H * W
nn.Linear(256, 256),
That fixed that error, but now getting this one:

RuntimeError: size mismatch, m1: [32 x 6400], m2: [25600 x 256] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:290

I’m quite new to PyTorch and CNNs, so could you point me towards a resource that walks me through what the shape of each layer should be and what each component is?

If your input is (N, 3, 178, 178), the activation after the last pooling layer has shape (N, 256, 5, 5), so change

nn.Linear(256 * 10 * 10, 256)

to

nn.Linear(256 * 5 * 5, 256)
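
A simple way to see what shape each layer produces is to push a dummy input through the convolutional part and print the intermediate shapes. A minimal sketch, reusing the conv/pool layers from your corrected model (the dummy batch size of 1 is arbitrary; only the 3 x 178 x 178 input shape matters):

import torch
import torch.nn as nn

# convolutional part of the corrected model (the Linear layers are left out on purpose)
features = nn.Sequential(nn.Conv2d(3, 32, kernel_size=(3,3), stride=1),
                         nn.Conv2d(32, 32, kernel_size=(3,3), stride=1),
                         nn.MaxPool2d(kernel_size=(2,2), stride=2),
                         nn.ReLU(),
                         nn.Conv2d(32, 64, kernel_size=(3,3), stride=1),
                         nn.Conv2d(64, 64, kernel_size=(3,3), stride=1),
                         nn.MaxPool2d(kernel_size=(2,2), stride=2),
                         nn.ReLU(),
                         nn.Conv2d(64, 128, kernel_size=(3,3), stride=1),
                         nn.Conv2d(128, 128, kernel_size=(3,3), stride=1),
                         nn.Conv2d(128, 128, kernel_size=(3,3), stride=1),
                         nn.MaxPool2d(kernel_size=(2,2), stride=2),
                         nn.ReLU(),
                         nn.Conv2d(128, 256, kernel_size=(3,3), stride=1),
                         nn.Conv2d(256, 256, kernel_size=(3,3), stride=1),
                         nn.Conv2d(256, 256, kernel_size=(3,3), stride=1),
                         nn.MaxPool2d(kernel_size=(2,2), stride=2),
                         nn.ReLU())

x = torch.randn(1, 3, 178, 178)  # dummy image with the same shape as your inputs
for layer in features:
    x = layer(x)
    print(type(layer).__name__, tuple(x.shape))
# the last line printed is: ReLU (1, 256, 5, 5) -> flattened size = 256 * 5 * 5 = 6400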

It works now! Thank you so much! :relaxed: