Runtime Error: Given groups=1, weight of size 32 32 3 3, expected input[32, 3, 178, 178] to have 32 channels, but got 3 channels instead

I’m new to PyTorch and trying to implement the BKNet network from the article “Facial smile detection using convolutional neural networks”.

I wrote the following neural network code:

model = nn.Sequential(nn.Conv2d(32, 32, kernel_size=(3,3), stride=1),
                      nn.Conv2d(32, 32, kernel_size=(3,3), stride=1),
                      nn.MaxPool2d(kernel_size=(2,2), stride=2),
                      nn.ReLU(),
                      nn.Conv2d(64, 64, kernel_size=(3,3), stride=1),
                      nn.Conv2d(64, 64, kernel_size=(3,3), stride=1),
                      nn.MaxPool2d(kernel_size=(2,2), stride=2),
                      nn.ReLU(),
                      nn.Conv2d(128, 128, kernel_size=(3,3), stride=1),
                      nn.Conv2d(128, 128, kernel_size=(3,3), stride=1),
                      nn.Conv2d(128, 128, kernel_size=(3,3), stride=1),
                      nn.MaxPool2d(kernel_size=(2,2), stride=2),
                      nn.ReLU(),
                      nn.Conv2d(256, 256, kernel_size=(3,3), stride=1),
                      nn.Conv2d(256, 256, kernel_size=(3,3), stride=1),
                      nn.Conv2d(256, 256, kernel_size=(3,3), stride=1),
                      nn.MaxPool2d(kernel_size=(2,2), stride=2),
                      nn.ReLU(),
                      nn.Linear(256,256),
                      nn.Linear(256,256),
                      nn.ReLU(),
                      nn.Linear(256,2),
                      nn.LogSoftmax(dim=1))

I trained it using the training code here on an image database with many images of size 178x218, which are transformed in the following way:

data_transforms = {
    'train': transforms.Compose([
        transforms.Resize(178),
        transforms.CenterCrop(178),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
    ]),
    'val': transforms.Compose([
        transforms.Resize(178),
        transforms.CenterCrop(178),
        transforms.ToTensor(),
        transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
    ]),
}

But I’m getting the following error:

RuntimeError: Given groups=1, weight of size 32 32 3 3, expected input[32, 3, 178, 178] to have 32 channels, but got 3 channels instead

What have I done wrong?

I think I fixed it by changing the number of input channels of the first convolutional layer of each block to match the number of output channels of the last convolutional layer of the previous block.

Now getting this error:

RuntimeError: size mismatch, m1: [163840 x 5], m2: [256 x 256] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:197

Could someone provide some help?

This error is most likely thrown because you are not including a flatten operation after your last pooling layer. Add nn.Flatten right before your first nn.Linear and run the code again.
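
For illustration, nn.Flatten keeps the batch dimension and collapses all remaining dimensions into one (the shape below is just an example, not taken from your model):

import torch
import torch.nn as nn

x = torch.randn(32, 256, 5, 5)   # e.g. an activation of shape [N, C, H, W]
flat = nn.Flatten()(x)           # keeps dim 0 (batch), flattens the rest
print(flat.shape)                # torch.Size([32, 6400]), i.e. [N, C * H * W]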

I tried it but I’m still getting the same error :frowning:

  1. As @ptrblck said, one of your errors is that you should add nn.Flatten before the first nn.Linear.
  2. Also, change the first line
    model = nn.Sequential(nn.Conv2d(32, 32, kernel_size=(3, 3), stride=1),
    to
    model = nn.Sequential(nn.Conv2d(3, 32, kernel_size=(3, 3), stride=1),
    as your input is a tensor with 3 channels (see the small check after this list).
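
For reference, a small check with a random batch shaped like the one in your error message, showing that the first layer has to accept 3 input channels:

import torch
import torch.nn as nn

x = torch.randn(32, 3, 178, 178)                         # images after ToTensor(): 3 RGB channels
out = nn.Conv2d(3, 32, kernel_size=(3, 3), stride=1)(x)
print(out.shape)                                         # torch.Size([32, 32, 176, 176])
# nn.Conv2d(32, 32, ...)(x) would raise the "expected input ... to have 32 channels" error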

I changed it to this:

model = nn.Sequential(nn.Conv2d(3, 32, kernel_size=(3,3), stride=1),
                      nn.Conv2d(32, 32, kernel_size=(3,3), stride=1),
                      nn.MaxPool2d(kernel_size=(2,2), stride=2),
                      nn.ReLU(),
                      nn.Conv2d(32, 64, kernel_size=(3,3), stride=1),
                      nn.Conv2d(64, 64, kernel_size=(3,3), stride=1),
                      nn.MaxPool2d(kernel_size=(2,2), stride=2),
                      nn.ReLU(),
                      nn.Conv2d(64, 128, kernel_size=(3,3), stride=1),
                      nn.Conv2d(128, 128, kernel_size=(3,3), stride=1),
                      nn.Conv2d(128, 128, kernel_size=(3,3), stride=1),
                      nn.MaxPool2d(kernel_size=(2,2), stride=2),
                      nn.ReLU(),
                      nn.Conv2d(128, 256, kernel_size=(3,3), stride=1),
                      nn.Conv2d(256, 256, kernel_size=(3,3), stride=1),
                      nn.Conv2d(256, 256, kernel_size=(3,3), stride=1),
                      nn.MaxPool2d(kernel_size=(2,2), stride=2),
                      nn.ReLU(),
                      nn.Flatten(),
                      nn.Linear(256,256),
                      nn.Linear(256,256),
                      nn.ReLU(),
                      nn.Linear(256,2),
                      nn.LogSoftmax(dim=1))

But still getting this error:

RuntimeError: size mismatch, m1: [32 x 25600], m2: [256 x 256] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:290

The in_features of the first nn.Linear has to match the flattened activation size, i.e. C * H * W after the last pooling layer:

nn.MaxPool2d(kernel_size=(2, 2), stride=2),  # output shape: [N, C, H, W]
nn.ReLU(),  # output shape: [N, C, H, W]
nn.Flatten(),  # output shape: [N, C * H * W]
nn.Linear(256 * 10 * 10, 256),  # set in_features to C * H * W
nn.Linear(256, 256),
That fixed that error, but now getting this one:

RuntimeError: size mismatch, m1: [32 x 6400], m2: [25600 x 256] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:290

I’m quite new to PyTorch and CNNs, so could you point me towards a resource that walks me through what the shape of each layer should be and what each component is?

If your input is (N, 3, 178, 178), the activation after the last pooling layer has shape (N, 256, 5, 5), so change

nn.Linear(256 * 10 * 10, 256)

to

nn.Linear(256 * 5 * 5, 256)
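
A simple way to see what shape each layer produces is to push a dummy input through the convolutional part and print the intermediate shapes. A minimal sketch, reusing the conv/pool layers from your corrected model (the dummy batch size of 1 is arbitrary; only the 3 x 178 x 178 input shape matters):

import torch
import torch.nn as nn

# convolutional part of the corrected model (the Linear layers are left out on purpose)
features = nn.Sequential(nn.Conv2d(3, 32, kernel_size=(3,3), stride=1),
                         nn.Conv2d(32, 32, kernel_size=(3,3), stride=1),
                         nn.MaxPool2d(kernel_size=(2,2), stride=2),
                         nn.ReLU(),
                         nn.Conv2d(32, 64, kernel_size=(3,3), stride=1),
                         nn.Conv2d(64, 64, kernel_size=(3,3), stride=1),
                         nn.MaxPool2d(kernel_size=(2,2), stride=2),
                         nn.ReLU(),
                         nn.Conv2d(64, 128, kernel_size=(3,3), stride=1),
                         nn.Conv2d(128, 128, kernel_size=(3,3), stride=1),
                         nn.Conv2d(128, 128, kernel_size=(3,3), stride=1),
                         nn.MaxPool2d(kernel_size=(2,2), stride=2),
                         nn.ReLU(),
                         nn.Conv2d(128, 256, kernel_size=(3,3), stride=1),
                         nn.Conv2d(256, 256, kernel_size=(3,3), stride=1),
                         nn.Conv2d(256, 256, kernel_size=(3,3), stride=1),
                         nn.MaxPool2d(kernel_size=(2,2), stride=2),
                         nn.ReLU())

x = torch.randn(1, 3, 178, 178)  # dummy image with the same shape as your inputs
for layer in features:
    x = layer(x)
    print(type(layer).__name__, tuple(x.shape))
# the last line printed is: ReLU (1, 256, 5, 5) -> flattened size = 256 * 5 * 5 = 6400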

It works now! Thank you so much! :relaxed: