RuntimeError: size mismatch, m1: [100 x 1568], m2: [3072 x 5] - training CNN with colour images

rosie934 · October 29, 2020, 2:35pm

I know there are a lot of similar questions on here with this error, but I have looked at all of them and am really stumped as to where I have gone wrong (I am very new to PyTorch). Any help gratefully received!

# Set device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Transforms
damp_transform = transforms.Compose(
    [transforms.ToPILImage(),
     transforms.Resize((28,28)), 
     transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

# Hyperparameters
num_epochs = 5;
batch_size = 100;
learning_rate = 0.001;
num_classes = 5

# Load Data
dataset = DampDataset(dataframe = labels, root_dir = 'training-set-circle-6/images/', transform = damp_transform)
test_dataset = DampDataset(dataframe = test_labels, root_dir = 'training-set-circle-6/testing-images/', transform = damp_transform)

# Create the dataloaders
train_loader = DataLoader(dataset=dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=True)

# Loss and optimizer
criterion = nn.CrossEntropyLoss() 
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(in_channels=3, out_channels=16, kernel_size=5, padding=2),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(2))
        self.layer2 = nn.Sequential(
            nn.Conv2d(in_channels=16, out_channels=32, kernel_size=5, padding=2),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2))
        self.fc = nn.Linear(in_features=16 * 16 * 12, out_features=num_classes)
        
    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = out.view(out.size(0), -1)
        out = self.fc(out)
        return out

#instance of the Conv Net
cnn = CNN();

from torch.autograd import Variable

losses = [];
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        images = Variable(images.float())
        labels = Variable(labels)
        print(images.shape)
        # Forward + Backward + Optimize
        optimizer.zero_grad()
        outputs = cnn(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        
        losses.append(loss.data[0]);
        
        if (i+1) % 100 == 0:
            print ('Epoch : %d/%d, Iter : %d/%d,  Loss: %.4f' 
                   %(epoch+1, num_epochs, i+1, len(train_dataset)//batch_size, loss.data[0]))

Error message:

torch.Size([100, 3, 28, 28])
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-13-be43263879c0> in <module>()
      9         # Forward + Backward + Optimize
     10         optimizer.zero_grad()
---> 11         outputs = cnn(images)
     12         loss = criterion(outputs, labels)
     13         loss.backward()

4 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in linear(input, weight, bias)
   1672     if input.dim() == 2 and bias is not None:
   1673         # fused op is marginally faster
-> 1674         ret = torch.addmm(bias, input, weight.t())
   1675     else:
   1676         output = input.matmul(weight.t())

RuntimeError: size mismatch, m1: [100 x 1568], m2: [3072 x 5] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:41

ptrblck · October 29, 2020, 6:11pm

The shape mismatch error is raised in self.fc, because the number of features in the incoming activation doesn't match the in_features of this layer.
You can add a print statement before passing out to self.fc:

        out = out.view(out.size(0), -1)
        print(out.shape)
        out = self.fc(out)

and you would see that out has the shape [batch_size, 1568], while self.fc is defined with in_features=16*16*12=3072.

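The in_features calculation is wrong; it should be 7*7*32=1568. The padded convolutions keep the spatial size unchanged, each MaxPool2d(2) halves it (28 -> 14 -> 7), and layer2 outputs 32 channels.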
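
To check the 1568 figure by hand, the spatial size can be traced through the layers with the usual output-size formula for convolution and pooling. This is a minimal sketch in plain Python; conv2d_out and pool2d_out are hypothetical helper names (not PyTorch functions), and it assumes square inputs, dilation 1 and no pooling padding:

```python
def conv2d_out(size, kernel_size, padding=0, stride=1):
    """Output height/width of a conv layer (dilation 1 assumed)."""
    return (size + 2 * padding - kernel_size) // stride + 1


def pool2d_out(size, kernel_size, stride=None):
    """Output height/width of an unpadded max-pool layer.

    The stride defaults to kernel_size, as it does in nn.MaxPool2d.
    """
    stride = kernel_size if stride is None else stride
    return (size - kernel_size) // stride + 1


size = 28                                           # after Resize((28, 28))
size = conv2d_out(size, kernel_size=5, padding=2)   # layer1 conv: 28
size = pool2d_out(size, kernel_size=2)              # layer1 pool: 14
size = conv2d_out(size, kernel_size=5, padding=2)   # layer2 conv: 14
size = pool2d_out(size, kernel_size=2)              # layer2 pool: 7

out_channels = 32                                   # out_channels of layer2
in_features = out_channels * size * size
print(in_features)  # 1568
```

If the Resize target or the layer settings change, rerunning this arithmetic (or the print(out.shape) check above) gives the new in_features value.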

Also, note that Variables have been deprecated since PyTorch 0.4, so you can pass tensors to the model directly in newer versions.
Replace the .data[0] usage with .item() as well, since indexing a 0-dim loss tensor now raises an error.
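
For reference, one training step with these changes applied might look roughly like the sketch below. It reuses the model from the question with in_features set to 7 * 7 * 32, and it assumes a random batch of 4 images and random labels as stand-ins for the real DataLoader, so the loss value itself is meaningless:

```python
import torch
import torch.nn as nn
import torch.optim as optim

num_classes = 5


class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, padding=2),
            nn.BatchNorm2d(16), nn.ReLU(), nn.MaxPool2d(2))
        self.layer2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=5, padding=2),
            nn.BatchNorm2d(32), nn.ReLU(), nn.MaxPool2d(2))
        # 32 channels with a 7x7 feature map for 28x28 inputs
        self.fc = nn.Linear(7 * 7 * 32, num_classes)

    def forward(self, x):
        out = self.layer2(self.layer1(x))
        out = out.view(out.size(0), -1)   # flatten to [batch_size, 1568]
        return self.fc(out)


cnn = CNN()
criterion = nn.CrossEntropyLoss()
# Create the optimizer after the model so it receives the model's parameters
optimizer = optim.Adam(cnn.parameters(), lr=0.001)

# Placeholder batch (assumption): random data instead of train_loader
images = torch.randn(4, 3, 28, 28)        # plain tensors, no Variable
labels = torch.randint(0, num_classes, (4,))

optimizer.zero_grad()
outputs = cnn(images)
loss = criterion(outputs, labels)
loss.backward()
optimizer.step()

losses = [loss.item()]                    # Python float, replaces loss.data[0]
```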