So, I’m getting the error:
Given groups=1, weight of size [64, 32, 3, 3], expected input[128, 3, 32, 32] to have 32 channels, but got 3 channels instead
What’s interesting is when I get the error.
With this version of the class:
import torch
import torch.nn as nn
import torch.nn.functional as F

class CNN(nn.Module):
    def __init__(self, K):
        super(CNN, self).__init__()
        # define the conv layers
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, stride=2)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=2)
        self.conv3 = nn.Conv2d(64, 128, kernel_size=3, stride=2)
        # define the linear layers
        self.fc1 = nn.Linear(128 * 3 * 3, 1024)
        self.fc2 = nn.Linear(1024, K)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.relu(self.conv3(x))
        x = x.view(-1, 128 * 3 * 3)
        x = F.dropout(x, p=0.5)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, p=0.2)
        x = self.fc2(x)
        return x
I don’t get any error.
What I’m trying to do is define some lists containing the number of units, layer types, etc., so that I can build different architectures dynamically. I’ve begun by making only the conv part dynamic, so here are the changes:
conv_filter_size = (3, 3)
conv_stride = (2, 2)
num_units = [C1, 32, 64, 128]  # C1 = 3 in this case

class CNN(nn.Module):
    def __init__(self, K):
        super(CNN, self).__init__()
        # define the conv layers
        for layer in range(1, len(num_units)):
            exec(f'self.conv{layer} = nn.Conv2d({num_units[layer-1]}, {num_units[layer]}, kernel_size = {conv_filter_size}, stride = {conv_stride})')
        self.fc1 = nn.Linear(128 * 3 * 3, 1024)
        self.fc2 = nn.Linear(1024, K)

    def forward(self, x):
        for layer in range(1, len(num_units)):
            exec(f'x = F.relu(self.conv{layer}(x))')
        x = x.view(-1, 128 * 3 * 3)
        x = F.dropout(x, p=0.5)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, p=0.2)
        x = self.fc2(x)
        return x
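One thing that may be worth checking here, independent of PyTorch, is how exec treats an assignment to a local variable inside a function. The sketch below uses two hypothetical helper functions (not part of the code above) to compare a plain assignment with the same assignment made through exec. If the same behaviour applies to the forward method above, every conv layer would be called on the original 3-channel input rather than on the previous layer’s output.

```python
def double_with_assignment(x):
    # A plain assignment rebinds the function's local name x.
    x = x * 2
    return x


def double_with_exec(x):
    # Inside a function, exec runs the assignment against a copy of the
    # local namespace, so the function's own x is not rebound.
    exec('x = x * 2')
    return x


print(double_with_assignment(3))  # 6
print(double_with_exec(3))        # 3
```

The second function returns its argument unchanged, because the result of the exec'd assignment never reaches the enclosing function’s local variable.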
Now, I want to make it super clear that the ONLY difference between getting the error and not getting it is how the CNN class is declared.
Also, both produce the same result when I call:
model = CNN(K)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)
model.to(device)
and the result is:
cuda:0
CNN(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2))
  (conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(2, 2))
  (conv3): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2))
  (fc1): Linear(in_features=1152, out_features=1024, bias=True)
  (fc2): Linear(in_features=1024, out_features=10, bias=True)
)
The input batch that comes out of train_loader has shape torch.Size([128, 3, 32, 32]).
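As a sanity check on the 128 * 3 * 3 used for fc1, the spatial size after each conv layer can be worked out by hand. This is a minimal sketch using the standard output-size formula for a convolution without dilation; conv_output_size is a hypothetical helper, and zero padding is assumed because the layers above do not set any.

```python
def conv_output_size(size, kernel_size, stride, padding=0):
    # Standard formula for one spatial dimension of a convolution
    # without dilation: floor((size + 2*padding - kernel) / stride) + 1.
    return (size + 2 * padding - kernel_size) // stride + 1


size = 32  # height and width of the input images
sizes = []
for _ in range(3):  # three conv layers, each with kernel 3 and stride 2
    size = conv_output_size(size, kernel_size=3, stride=2)
    sizes.append(size)

flattened = 128 * size * size  # 128 channels after the last conv layer

print(sizes)      # [15, 7, 3]
print(flattened)  # 1152
```

The result matches in_features=1152 in the printed model, so the fc1 size is consistent with a 32x32 input passing through all three conv layers in sequence.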
Now, same data, same structure, same code apart from CNN itself, but the second version gives me this error. Any ideas why?
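For comparison, here is a sketch of how the same list-driven conv stack could be built without exec, using nn.ModuleList. The class name DynamicCNN and its constructor arguments are illustrative assumptions, not part of the code above; the sketch also assumes 32x32 inputs (so the last feature map is 3x3) and passes training=self.training to F.dropout, which the original code does not do.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicCNN(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, K, num_units, kernel_size=(3, 3), stride=(2, 2)):
        super().__init__()
        # One Conv2d per consecutive pair of channel counts,
        # e.g. [3, 32, 64, 128] gives 3->32, 32->64, 64->128.
        self.convs = nn.ModuleList([
            nn.Conv2d(c_in, c_out, kernel_size=kernel_size, stride=stride)
            for c_in, c_out in zip(num_units[:-1], num_units[1:])
        ])
        # Assumes 32x32 inputs, so the last feature map is 3x3.
        self.fc1 = nn.Linear(num_units[-1] * 3 * 3, 1024)
        self.fc2 = nn.Linear(1024, K)

    def forward(self, x):
        # Plain loop: x is rebound on every iteration.
        for conv in self.convs:
            x = F.relu(conv(x))
        x = x.view(x.size(0), -1)
        # training=self.training is an addition to the original code.
        x = F.dropout(x, p=0.5, training=self.training)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, p=0.2, training=self.training)
        return self.fc2(x)


model = DynamicCNN(K=10, num_units=[3, 32, 64, 128])
out = model(torch.randn(4, 3, 32, 32))
print(out.shape)
```

Because the layers live in an nn.ModuleList, their parameters are registered with the module in the same way as attributes assigned directly, so model.to(device) and the optimizer should see them.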