Error: Given groups=1, weight of size [], expected input[]

So, I’m getting the error:
Given groups=1, weight of size [64, 32, 3, 3], expected input[128, 3, 32, 32] to have 32 channels, but got 3 channels instead

What’s interesting is when I get the error.

With the following class:

import torch
import torch.nn as nn
import torch.nn.functional as F

class CNN(nn.Module):
  def __init__(self, K):
    super(CNN, self).__init__()
    
    # define the conv layers
    self.conv1 = nn.Conv2d(3, 32, kernel_size=3, stride=2)
    self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=2)
    self.conv3 = nn.Conv2d(64, 128, kernel_size=3, stride=2)

    # define the linear layers
    self.fc1 = nn.Linear(128 * 3 * 3, 1024)
    self.fc2 = nn.Linear(1024, K)
  
  def forward(self, x):
    x = F.relu(self.conv1(x))
    x = F.relu(self.conv2(x))
    x = F.relu(self.conv3(x))
    x = x.view(-1, 128 * 3 * 3)
    x = F.dropout(x, p=0.5)
    x = F.relu(self.fc1(x))
    x = F.dropout(x, p=0.2)
    x = self.fc2(x)
    return x

I don’t get any error.
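(For reference, the 128 * 3 * 3 flatten size comes from the shapes: a 32x32 input shrinks to 15x15 after conv1, 7x7 after conv2 and 3x3 after conv3. A quick sanity check, just a sketch assuming the class above and K = 10:)

model = CNN(10)
dummy = torch.randn(128, 3, 32, 32)  # same shape the train_loader yields
print(model(dummy).shape)            # torch.Size([128, 10])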

What I’m trying to do is define some lists containing the number of units, layer types, etc., so that I can dynamically build different architectures. I’ve begun by making only the conv part dynamic, so here are the changes:

conv_filter_size = (3, 3)
conv_stride = (2, 2)
num_units = [C1, 32, 64, 128] #C1 = 3 in this case

class CNN(nn.Module):
    
    def __init__(self, K):
        super(CNN, self).__init__()
        
        # define the conv layers
        for layer in range(1, len(num_units)):
            exec(f'self.conv{layer} = nn.Conv2d({num_units[layer-1]}, {num_units[layer]}, kernel_size = {conv_filter_size}, stride = {conv_stride})')
        
        self.fc1 = nn.Linear(128 * 3 * 3, 1024)
        self.fc2 = nn.Linear(1024, K)
        
    def forward(self, x):
        for layer in range(1, len(num_units)):
            exec(f'x = F.relu(self.conv{layer}(x))')
        
        x = x.view(-1, 128 * 3 * 3)
        x = F.dropout(x, p=0.5)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, p=0.2)
        x = self.fc2(x)
        return x

Now, I want to make it super clear that the ONLY difference between getting the error and not getting it is how the CNN class is declared.
Also, both produce the same result when I call:

model = CNN(K)

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)
model.to(device)

and the result is:

cuda:0

CNN(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2))
  (conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(2, 2))
  (conv3): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2))
  (fc1): Linear(in_features=1152, out_features=1024, bias=True)
  (fc2): Linear(in_features=1024, out_features=10, bias=True)
)

The input data shape is torch.Size([128, 3, 32, 32]) when the train_loader is iterated.
So: same data, same structure, same code apart from the CNN class itself, yet the second version gives me this problem. Any ideas why?

It seems exec can’t write back to locals(): inside forward, the assignment that exec performs goes into a temporary copy of the local namespace, so x is never actually rebound. Every conv in the loop therefore still receives the original 3-channel input, and conv2, whose weight is [64, 32, 3, 3] and expects 32 channels, raises the error.
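Here’s a minimal sketch of that behaviour (just an illustration, not code from your model):

def demo():
    x = 1
    exec('x = 2')  # the assignment lands in a copy of locals() ...
    print(x)       # ... so this still prints 1

demo()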

The code below works, though:

for layer in range(1, len(num_units)):
    x = F.relu(getattr(self, f'conv{layer}')(x))
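For completeness, a cleaner way to build the conv stack dynamically (just a sketch, not code from this thread) is to drop exec entirely and register the layers in an nn.ModuleList. They stay visible to model.parameters() / model.to(device), and forward() can iterate over them with ordinary assignments:

import torch.nn as nn
import torch.nn.functional as F

class CNN(nn.Module):
    def __init__(self, K, num_units=(3, 32, 64, 128)):
        super(CNN, self).__init__()
        # one Conv2d per consecutive pair in num_units
        self.convs = nn.ModuleList([
            nn.Conv2d(num_units[i - 1], num_units[i], kernel_size=3, stride=2)
            for i in range(1, len(num_units))
        ])
        self.fc1 = nn.Linear(128 * 3 * 3, 1024)
        self.fc2 = nn.Linear(1024, K)

    def forward(self, x):
        for conv in self.convs:
            x = F.relu(conv(x))
        x = x.view(-1, 128 * 3 * 3)
        x = F.dropout(x, p=0.5)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, p=0.2)
        return self.fc2(x)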

Oh wow, that works beautifully, yes.
Thank you for your help.