RuntimeError: mat1 and mat2 shapes cannot be multiplied

Hi,
I am new to PyTorch and deep learning in general. I created the class below and am trying to run it.

import torch.nn as nn

class StudentModel(nn.Module):
    def __init__(self, num_classes=k):
        super(StudentModel, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1)
        self.relu1 = nn.ReLU()
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.relu2 = nn.ReLU()
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.fc1 = nn.Linear(64, 128)
        self.relu3 = nn.ReLU()
        self.fc2 = nn.Linear(128, num_classes)
 
    def forward(self, x):
        out = self.conv1(x)
        out = self.relu1(out)
        out = self.conv2(out)
        out = self.relu2(out)
        out = self.pool(out)
        out = out.view(out.size(0), -1)
        out = self.fc1(out)
        out = self.relu3(out)
        out = self.fc2(out)
        return out

Unfortunately, I get this error and I do not understand why, since each layer's 'out_features' is equal to the 'in_features' of the next layer.

RuntimeError: mat1 and mat2 shapes cannot be multiplied (128x16384 and 64x128)

The error message does not match your model architecture, since self.fc1 uses 4096 input features while self.fc2 uses 128.
Could you check whether you copied the wrong model? It is also failing in the super() call, since a wrong class name is used.

I edited the question, but I still get the error. The full traceback is below:

RuntimeError                              Traceback (most recent call last)
Cell In[81], line 107
    104 student_model = StudentModel()
    106 #get inference time for student network
--> 107 get_model_inference_time(student_model)
    109 student_model.to(device)
    111 student_model_weights_path = 'student_model.pt'

Cell In[76], line 16, in get_model_inference_time(model)
     13 input = torch.rand((128, 3, 32, 32))
     15 start = time.time()
---> 16 out = model(input)
     17 end = time.time()
     18 assert out.shape[0] == 128

File /opt/jlab-env/lib/python3.11/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

Cell In[81], line 79, in StudentModel.forward(self, x)
     77 out = self.pool(out)
     78 out = out.view(out.size(0), -1)
---> 79 out = self.fc1(out)
     80 out = self.relu3(out)
     81 out = self.fc2(out)

File /opt/jlab-env/lib/python3.11/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

File /opt/jlab-env/lib/python3.11/site-packages/torch/nn/modules/linear.py:114, in Linear.forward(self, input)
    113 def forward(self, input: Tensor) -> Tensor:
--> 114     return F.linear(input, self.weight, self.bias)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (128x16384 and 64x128)

Correct the in_features of self.fc1 and it should work. After the pooling layer your activation has 64 channels of spatial size 16x16, so the flattened size is 64 * 16 * 16 = 16384:

self.fc1 = nn.Linear(16384, 128)
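If you don't want to compute this by hand, one way is to push a dummy input through the convolutional part of the model and read off the flattened size (this sketch assumes 3x32x32 inputs, as in your traceback):

```python
import torch
import torch.nn as nn

# Sketch: infer fc1's in_features by running a dummy input through the
# convolutional layers (same layers as in your model).
features = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1),
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
)
with torch.no_grad():
    n_flat = features(torch.zeros(1, 3, 32, 32)).flatten(1).shape[1]
print(n_flat)  # 64 channels * 16 * 16 = 16384
```

Alternatively, nn.LazyLinear(128) infers its in_features automatically from the first forward pass.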

Now it works. I changed the parameters because I had a constraint on the number of weights.
By the way, are the layers I am using a good choice if my goal is accuracy? I am working with images and I measure accuracy using f1_score and accuracy_score from sklearn.
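For reference, this is how those sklearn metrics are typically applied to model outputs (the logits and labels here are made up for illustration):

```python
import torch
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical logits for 3 samples and 2 classes, plus ground-truth labels.
logits = torch.tensor([[2.0, 0.1], [0.2, 1.5], [1.0, 0.3]])
y_true = [0, 1, 1]

# sklearn expects label arrays, so convert logits to class predictions first.
y_pred = logits.argmax(dim=1).cpu().numpy()

acc = accuracy_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred, average="macro")
print(acc, f1)
```

Note that f1_score needs an `average` argument (e.g. "macro" or "weighted") once you have more than two classes.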

Thank you very much.

The model looks fine and is a small CNN. Depending on your use case, you might want to compare it against a larger model, e.g. a ResNet.