RuntimeError: mat1 and mat2 shapes cannot be multiplied (64x13056 and 153600x2048)

Hi, thank you for helping me!

In fact, I don’t understand why I have to put 32. Moreover, it doesn’t work: when I change input_size to 32, I get the following error:

RuntimeError: mat1 and mat2 shapes cannot be multiplied (10x4 and 32x24)

So I would have to change 32 to 4, but that’s the initial issue.
I don’t understand why I’m getting this now. My input tensor is 150x4:

I just found the error. In fact, it’s in the convert function. I should have put ‘input_size’ for the last dim instead of 32.
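
For anyone hitting the same message: the pair of shapes in the error tells you exactly what mismatches. In (10x4 and 32x24), the input’s last dim is 4 while the layer’s in_features is 32. A minimal generic sketch (not the actual convert function):

import torch
import torch.nn as nn

x = torch.randn(10, 4)       # batch of 10 samples, 4 features each
layer = nn.Linear(32, 24)    # in_features=32 does not match x's last dim
# layer(x)                   # RuntimeError: mat1 and mat2 shapes cannot be multiplied (10x4 and 32x24)

layer = nn.Linear(4, 24)     # match in_features to the input's last dim
print(layer(x).shape)        # torch.Size([10, 24])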


Hey all - I am getting a similar error. Can anybody see my mistake?

import torch
from torchsummary import summary  # assumption: summary() comes from torchsummary

input_shape = (120, 120, 3)
num_classes = 7

class CNN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.main = torch.nn.Sequential(
            torch.nn.Conv2d(3, 32, 3, padding=1),
            torch.nn.ReLU(),
            torch.nn.Conv2d(32, 64, 3, padding=1),
            torch.nn.ReLU(),
            torch.nn.AvgPool2d(2),
            torch.nn.Dropout(0.25),
            torch.nn.Conv2d(64, 64, 3, padding=1),
            torch.nn.ReLU(),
            torch.nn.Conv2d(64, 64, 3, padding=1),
            torch.nn.ReLU(),
            torch.nn.AvgPool2d(2),
            torch.nn.Dropout(0.25),
            torch.nn.Conv2d(64, 64, 3, padding=1),
            torch.nn.ReLU(),
            torch.nn.Conv2d(64, 64, 3, padding=1),
            torch.nn.ReLU(),
            torch.nn.AvgPool2d(2),
            torch.nn.Dropout(0.25),
            torch.nn.Flatten(),
            torch.nn.Linear(64,128),
            torch.nn.ReLU(),
            torch.nn.Dropout(0.25),
            torch.nn.Linear(128, num_classes),
            torch.nn.Softmax(dim=1)
        )

    def forward(self, x):
        out = self.main(x)
        return out

model = CNN()
print(summary(model, (3, 120, 120)))

RuntimeError: mat1 and mat2 shapes cannot be multiplied (2x14400 and 64x128)

Input are RGB pictures, 120x120px

The error is raised in the first linear layer as it expects 64 input features while the input activation has 14400 features.
Use:

...
torch.nn.Dropout(0.25),
torch.nn.Flatten(),
torch.nn.Linear(14400, 128),
torch.nn.ReLU(),
...

and it should work.

Also, in case you are using nn.CrossEntropyLoss as the loss function, remove the Softmax layer at the end as raw logits are expected.
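
For reference, the 14400 comes from the architecture itself: each AvgPool2d(2) halves the spatial size (120 → 60 → 30 → 15) and the last conv layer outputs 64 channels, so Flatten yields 64 * 15 * 15 = 14400 features. A quick sketch to verify it (the [:18] slice, i.e. all modules before nn.Flatten, is an assumption based on the module order above):

import torch

x = torch.randn(1, 3, 120, 120)
conv_part = model.main[:18]    # everything before nn.Flatten()
out = conv_part(x)
print(out.shape)               # torch.Size([1, 64, 15, 15])
print(out.flatten(1).shape)    # torch.Size([1, 14400])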

Hello, I have this error in Python. Can somebody help me and explain what the error is? I would like to understand. Input images are 943x900.

And I get this error:

The shape mismatch is raised in the first linear layer of self.fc as it’s expecting an input activation with 15552 features while the actual activation has 634800 features. Change the in_features of this linear layer to 634800 and it should work.

PS: you can post code snippets by wrapping them into three backticks ```, which would make debugging easier.


Thank you for your answer! I’m trying it now. How do you know the number of features that your CNN needs in your architecture? Sorry if my English is not very good ^^

I have the same question for the 15552. My program crashed because I don’t have enough RAM, so I will change my images to 400x400.

You could either:

  • calculate the number of features manually using the desired input shape as well as the model architecture (this might be tedious, especially if the model architecture is not trivial),
  • execute a forward pass and print the activation shape before it’s passed to the first linear layer (it would crash in the first run; you could then set the desired in_features and start the training), or
  • use the nn.Lazy* modules, which initialize the number of input features for you from the passed input (see the sketch after this list).
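
A minimal sketch of the third option (assuming a PyTorch version that ships the nn.Lazy* modules):

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.AvgPool2d(2),
    nn.Flatten(),
    nn.LazyLinear(128),   # in_features is inferred from the first input
    nn.ReLU(),
    nn.Linear(128, 10),
)

x = torch.randn(2, 3, 120, 120)
print(model(x).shape)     # torch.Size([2, 10]); the lazy layer is now materialized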

If you are running out of memory, try to decrease the batch size or the spatial size of the input, use more aggressive pooling, or generally change the model architecture to use smaller activations and fewer parameters.

The problem is that when I reduce the size of my images, I lose information about specific pixels whose coordinates I extract because they have a specific color. I send the network an input image together with the list of coordinates of the pixels whose positions I want it to learn. I tried reducing the batch_size and the network architecture, but it just takes an infinite amount of time. I hope you understand my goal: I send it an input image and a vector of pixel coordinates, and the network has to make the link; it has to place the points of these specific pixels on the input image.

Hello, I’m currently facing a similar issue with my code.
I’m running an FNN; my input tensor shape is [1750, 5] and my output tensor is [1750, 1].

Here is my code:

import numpy as np
import torch
import torch as tch  # the code below uses both aliases
import torch.nn as nn
from sklearn.model_selection import train_test_split

# df is assumed to be a pandas DataFrame loaded elsewhere
train_input = df.iloc[:, 0:5].values.astype('float32')
print(train_input.shape)
ti = torch.tensor(train_input)
print("ti shape",ti.shape)

train_output = df.iloc[:, 38].values.astype('float32')
to = torch.tensor(train_output)
to = to.view(1750, 1)
print("to.shape",to.shape)

#Training and Validation Split
ti, val_i, to, val_o = train_test_split(ti, to, random_state=2020, test_size=0.2)



class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        tch.manual_seed(2020)
        self.fc1 = nn.Linear(5, 10)
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Linear(10, 1)
        self.final = nn.ReLU()

    def forward(self, x):
        op = self.fc1(x)
        op = self.relu1(op)
        op = self.fc2(op)
        y = self.final(op)
        return y

def train_network(model, optimizer, loss_function, num_epochs, batch_size, ti, to):
    # Explicitly start model training
    model.train()

    loss_across_epochs = []
    for epochs in range(num_epochs):
        train_loss = 0.0

        for i in range(0, ti.shape[0], batch_size):
            # Extract the train batch from X and Y
            input_data = ti[i:min(ti.shape[0], i + batch_size)]
            labels = to[i:min(to.shape[0], i + batch_size)]

            # set the gradients to zero before starting backpropagation
            optimizer.zero_grad()
            # Forward pass
            output_data = model(input_data)
            # Calculate loss
            loss = loss_function(output_data, labels)
            # Backpropagate
            loss.backward()

            # Update weights
            optimizer.step()

            train_loss += loss.item() * batch_size

        print("Epoch: {} - Loss:{:.4f}".format(epochs + 1, train_loss))
        loss_across_epochs.extend([train_loss])

    #Predict
    y_test_pred = model(val_o)
    a = np.where(y_test_pred > 0.5, 1, 0)
    return loss_across_epochs



# Create an object of the Neural Network class
model = NeuralNetwork()
# Define loss function
loss_function = nn.CrossEntropyLoss()  #  Cross Entropy Loss
# Define Optimizer
adam_optimizer = tch.optim.Adam(model.parameters(), lr=0.001)
# Define epochs and batch size
num_epochs = 12
batch_size = 5

# Calling the function for training and pass model, optimizer, loss and related parameters
adam_loss = train_network(model, adam_optimizer, loss_function, num_epochs, batch_size, ti, to)

When running the code, part of the script output and an error message are displayed:

(1750, 5)
ti shape torch.Size([1750, 5])
to.shape torch.Size([1750, 1])
Epoch: 1 - Loss:0.0000
Epoch: 2 - Loss:0.0000
Epoch: 3 - Loss:0.0000
Epoch: 4 - Loss:0.0000
Epoch: 5 - Loss:0.0000
Epoch: 6 - Loss:0.0000
Epoch: 7 - Loss:0.0000
Epoch: 8 - Loss:0.0000
Epoch: 9 - Loss:0.0000
Epoch: 10 - Loss:0.0000
Epoch: 11 - Loss:0.0000
Epoch: 12 - Loss:0.0000


Traceback (most recent call last):
  File "/home2/baptiste/anaconda3/envs/pythonProject1/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3398, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-c6c09d76d5ce>", line 1, in <cell line: 1>
    runfile('/home2/baptiste/PycharmProjects/pythonProject/pythonProject1/PV project/Test FNN 1.py', wdir='/home2/baptiste/PycharmProjects/pythonProject/pythonProject1/PV project')
  File "/home2/baptiste/.local/share/JetBrains/Toolbox/apps/PyCharm-P/ch-0/221.5080.212/plugins/python/helpers/pydev/_pydev_bundle/pydev_umd.py", line 198, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "/home2/baptiste/.local/share/JetBrains/Toolbox/apps/PyCharm-P/ch-0/221.5080.212/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home2/baptiste/PycharmProjects/pythonProject/pythonProject1/PV project/Test FNN 1.py", line 101, in <module>
    adam_loss = train_network(model, adam_optimizer, loss_function, num_epochs, batch_size, ti, to)
  File "/home2/baptiste/PycharmProjects/pythonProject/pythonProject1/PV project/Test FNN 1.py", line 84, in train_network
    y_test_pred = model(val_o)
  File "/home2/baptiste/anaconda3/envs/pythonProject1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home2/baptiste/PycharmProjects/pythonProject/pythonProject1/PV project/Test FNN 1.py", line 40, in forward
    op = self.fc1(x)
  File "/home2/baptiste/anaconda3/envs/pythonProject1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home2/baptiste/anaconda3/envs/pythonProject1/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 103, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (350x1 and 5x10)

I’m stuck and don’t know what to do to get rid of this error. I’ve read all the messages in this post and tried to adapt my [batch_size, 5] input as well as my in_features of 5 so that they match, but nothing works as planned…

Any tips ?

The error is raised on:

y_test_pred = model(val_o)

and I think you are passing the validation targets instead of the inputs to the model, so swap it for val_i.
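
I.e. the prediction line should read:

# Predict on the validation inputs, not the targets
y_test_pred = model(val_i)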


Hello, thank you very much for your answer; after the modification, the error message disappeared.

However, I’m facing a new issue. Now I want to add an extra layer to my FNN, and the RuntimeError: mat1 and mat2 shapes cannot be multiplied issue persists…

Here is a new sketch of my code:

train_input = df.iloc[:, 0:5].values.astype('float32')
print(train_input.shape)
ti = torch.tensor(train_input)
print("ti shape",ti.shape)

train_output = df.iloc[:, 38].values.astype('float32')
to = torch.tensor(train_output)
to = to.view(1750, 1)
print("to.shape",to.shape)

#Training and Validation Split
ti, val_i, to, val_o = train_test_split(ti, to, random_state=2020, test_size=0.2)

class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        tch.manual_seed(2020)
        self.fc1 = nn.Linear(5, 10)
        self.fc2 = nn.Linear(10, 20)
        self.fc3 = nn.Linear(20, 1)
        self.relu1 = nn.ReLU()
        self.final = nn.Sigmoid()

    def forward(self, x):
        op = self.fc1(x)
        op = self.relu1(op)
        op = self.fc2(op)
        op = self.relu1(op)
        op = self.fc3(x)
        op = self.relu1(op)
        y = self.final(op)
        return y

def train_network(model, optimizer, loss_function, num_epochs, batch_size, ti, to):
    # Explicitly start model training
    model.train()

    loss_across_epochs = []
    for epochs in range(num_epochs):
        train_loss = 0.0

        for i in range(0, ti.shape[0], batch_size):

            # Extract the train batch from X and Y
            input_data = ti[i:min(ti.shape[0], i + batch_size)]
            labels = to[i:min(to.shape[0], i + batch_size)]

            # set the gradients to zero before starting backpropagation
            optimizer.zero_grad()

            # Forward pass
            output_data = model(input_data)

            # Calculate loss
            loss = loss_function(output_data, labels)

            # Backpropagate
            loss.backward()

            # Update weights
            optimizer.step()

            train_loss += loss.item() * batch_size

        print("Epoch: {} - Loss:{:.4f}".format(epochs + 1, train_loss))
        loss_across_epochs.extend([train_loss])

    #Predict
    y_test_pred = model(val_i)
    a = np.where(y_test_pred >= 0, 1, 0)
    return loss_across_epochs



# Create an object of the Neural Network class
model = NeuralNetwork()

# Define loss function
loss_function = nn.MSELoss()  # Mean Squared Error

# Define Optimizer
adam_optimizer = tch.optim.Adam(model.parameters(), lr=0.001)

# Define epochs and batch size
num_epochs = 10
batch_size = 1

# Calling the function for training and pass model, optimizer, loss and related parameters
adam_loss = train_network(model, adam_optimizer, loss_function, num_epochs, batch_size, ti, to)

The error message is still the same:

(1750, 5)
ti shape torch.Size([1750, 5])
to.shape torch.Size([1750, 1])
Traceback (most recent call last):
  File "/home2/baptiste/anaconda3/envs/pythonProject1/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3398, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-c6c09d76d5ce>", line 1, in <cell line: 1>
    runfile('/home2/baptiste/PycharmProjects/pythonProject/pythonProject1/PV project/Test FNN 1.py', wdir='/home2/baptiste/PycharmProjects/pythonProject/pythonProject1/PV project')
  File "/home2/baptiste/.local/share/JetBrains/Toolbox/apps/PyCharm-P/ch-0/221.5080.212/plugins/python/helpers/pydev/_pydev_bundle/pydev_umd.py", line 198, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "/home2/baptiste/.local/share/JetBrains/Toolbox/apps/PyCharm-P/ch-0/221.5080.212/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home2/baptiste/PycharmProjects/pythonProject/pythonProject1/PV project/Test FNN 1.py", line 115, in <module>
    adam_loss = train_network(model, adam_optimizer, loss_function, num_epochs, batch_size, ti, to)
  File "/home2/baptiste/PycharmProjects/pythonProject/pythonProject1/PV project/Test FNN 1.py", line 76, in train_network
    output_data = model(input_data)
  File "/home2/baptiste/anaconda3/envs/pythonProject1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home2/baptiste/PycharmProjects/pythonProject/pythonProject1/PV project/Test FNN 1.py", line 46, in forward
    op = self.fc3(x)
  File "/home2/baptiste/anaconda3/envs/pythonProject1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home2/baptiste/anaconda3/envs/pythonProject1/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 103, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x5 and 20x1)

Why does everything go wrong when I add one extra layer? Furthermore, I’m getting confused about the correct batch size that I should choose, as well as about this part of my code:

#Predict
    y_test_pred = model(val_i)
    a = np.where(y_test_pred >= 0, 1, 0)
    return loss_across_epochs

which is really not clear…

Thanks in advance for any tips or suggestions you could make :smile:

You are passing the input to self.fc3 while I guess you wanted to pass op to it:

        op = self.relu1(op)
        op = self.fc3(x) # !!!
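
I.e., keeping the rest of the architecture unchanged, the corrected forward pass would read:

def forward(self, x):
    op = self.fc1(x)
    op = self.relu1(op)
    op = self.fc2(op)
    op = self.relu1(op)
    op = self.fc3(op)  # pass the activation, not the original input x
    op = self.relu1(op)
    y = self.final(op)
    return y

(You may also want to drop the ReLU before the final Sigmoid, as it restricts the outputs to [0.5, 1).)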

Hi @ptrblck
I saw there is a ton of information about this error, but I couldn’t find a solution for my case. I am using a pretrained resnext for image classification, but it gives the same errors as the cases above. It works with some pretrained models like resnet, vgg, and so on, but not with resnext.

It is weird that some models work but others don’t.

It seems you are using a single linear layer and are replacing the resnext50_32x4d.
I assume you want to replace the model.fc layer only instead of the entire model.
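
A sketch of that replacement (assuming torchvision’s resnext50_32x4d and a hypothetical number of target classes):

import torch
import torch.nn as nn
from torchvision import models

model = models.resnext50_32x4d(pretrained=True)

# Replace only the final classification layer; the pretrained backbone stays intact.
num_classes = 10  # hypothetical
model.fc = nn.Linear(model.fc.in_features, num_classes)

x = torch.randn(1, 3, 224, 224)
print(model(x).shape)  # torch.Size([1, 10])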

PS: you can post code snippets by wrapping them into three backticks ```, which makes debugging easier.

hello @ptrblck
I’m facing a similar problem when creating my AlexNet. Here are my code and the error message; I would like to know how to solve the problem. Thank you for your reply!

class AlexNet(nn.Module):
    def __init__(self, num_classes=2, init_weights=False):
        super(AlexNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=3, stride=3, padding=2),  # input[3, 32, 32]  output[96, 12, 12]
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=1),  # output[96, 10, 10]
            nn.BatchNorm2d(96),
            nn.Conv2d(96, 256, kernel_size=5, padding=2),  # output[256, 10, 10]
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=1),  # output[256, 8, 8]
            nn.BatchNorm2d(256),
            nn.Conv2d(256, 384, kernel_size=3, padding=1),  # output[384, 8, 8]
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, padding=1),  # output[384, 8, 8]
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),  # output[256, 8, 8]
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=1),  # output[256, 6, 6]
        )
        self.classifier = torch.nn.Sequential(
            nn.Linear(9216, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, num_classes),
        )
        if init_weights:
            self._initialize_weights()

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, start_dim=1)
        x = self.classifier(x)
        return x

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)

And here’s my error message.

Use in_features=73984 in the first linear layer of self.classifier and it should work.
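
I.e. only the first linear layer of self.classifier changes. Note that 73984 = 256 * 17 * 17, which implies the conv stack actually outputs [256, 17, 17], so the real input images are larger than the 32x32 noted in the comments:

        self.classifier = torch.nn.Sequential(
            nn.Linear(73984, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, num_classes),
        )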