Size mismatch error for tensors of equal size

In VAE code I am trying to run, I get the error below, where the two mismatched tensors appear to be the same size. The error is triggered by the code below, which instantiates a Linear layer and then calls it. I'd appreciate any pointers on how to debug this.

RuntimeError: size mismatch, m1: [94 x 4608], m2: [94 x 4608] at …/aten/src/TH/generic/THTensorMath.cpp:961

Snippets from the source code:
self.fc1 = nn.Linear(h_dim, z_dim)
mu = self.fc1(h)

For m1: [a x b] and m2: [c x d], make sure that b == c. To illustrate this point:

import torch
import torch.nn as nn

batch_size = 12
fc1 = nn.Linear(12, 6)               # expects inputs with 12 features
input = torch.randn(batch_size, 6)   # only 6 features -> mismatch
output = fc1(input)

This will throw the error shown below:

RuntimeError: size mismatch, m1: [12 x 6], m2: [12 x 6] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:136

To correct it, you would have to rewrite it as:

import torch
import torch.nn as nn

batch_size = 16  # any arbitrary number
fc1 = nn.Linear(12, 6)
input = torch.randn(batch_size, 12)  # making sure the second dimension equals 12
output = fc1(input)

In your code, simply taking a transpose of `h` should remove the RuntimeError.
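For example, here is a minimal sketch of that fix, using hypothetical h_dim/z_dim values and assuming h arrives as [h_dim, batch] instead of the [batch, h_dim] layout nn.Linear expects:

import torch
import torch.nn as nn

h_dim, z_dim, batch_size = 4608, 20, 94  # hypothetical values
fc1 = nn.Linear(h_dim, z_dim)
h = torch.randn(h_dim, batch_size)  # feature-major layout: fc1(h) would raise the size mismatch
mu = fc1(h.t())                     # transpose to [batch, h_dim] first
print(mu.shape)                     # torch.Size([94, 20])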

Thanks! That takes me past this barrier for certain!

@charan_Vjy I have done the same as you are saying. Actually, I am using two linear layers, as given below:

self.fc1 = nn.Linear(2048, 246)
self.fc2 = nn.Linear(246, num_classes)

I am getting an error:
size mismatch, m1: [2048 x 246], m2: [2048 x 246] at C:/cb/pytorch_1000000000000/work/aten/src\THC/generic/THCTensorMathBlas.cu:283
Is there any problem in the code?
@ptrblck sir :slight_smile:

I guess you might be passing an input with a wrong shape to these layers, as your setup works fine:

num_classes = 10
fc1 = nn.Linear(2048, 246)
fc2 = nn.Linear(246, num_classes)
x = torch.randn(1, 2048)
out = fc1(x)
out = fc2(out)
print(out.shape)
> torch.Size([1, 10])

class Model1(torch.nn.Module):
    def __init__(self, num_classes=7):
        super(Model1, self).__init__()
        self.cn1 = nn.Conv1d(1, 32, 5, 1)
        self.mp1 = nn.MaxPool1d(4)
        self.cn2 = nn.Conv1d(32, 32, 5, 1)
        self.mp2 = nn.MaxPool1d(4)
        self.cn3 = nn.Conv1d(32, 64, 5, 1)
        self.mp3 = nn.MaxPool1d(4)
        self.cn4 = nn.Conv1d(64, 64, 5, 1)
        self.mp4 = nn.MaxPool1d(2)
        self.cn5 = nn.Conv1d(64, 128, 5, 1)
        self.mp5 = nn.MaxPool1d(2)
        self.cn6 = nn.Conv1d(128, 128, 5, 1)
        self.mp6 = nn.MaxPool1d(2)

        self.fc1 = nn.Linear(2048, 246)
        self.fc2 = nn.Linear(246, num_classes)

    def forward(self, x):
        out = F.relu(self.cn1(x))
        out = self.mp1(out)
        out = F.relu(self.cn2(out))
        out = self.mp2(out)
        out = F.relu(self.cn3(out))
        out = self.mp3(out)
        out = F.relu(self.cn4(out))
        out = self.mp4(out)
        out = F.relu(self.cn5(out))
        out = self.mp5(out)
        out = F.relu(self.cn6(out))
        out = self.mp6(out)
        out = self.fc1(out)
        out = self.fc2(out)

        return out

@ptrblck sir, this is the model, and the input is a (1, 128000) vector.
The 2048 is correct, but the second linear layer is throwing the same error every time.

The error is raised because you are not flattening the activation before feeding it into the linear layer.
Once it's flattened via out.view(out.size(0), -1), you'll hit a shape mismatch error and would need to change the in_features of self.fc1. This code should work:

import torch
import torch.nn as nn
import torch.nn.functional as F

class Model1(torch.nn.Module):
    def __init__(self, num_classes=7):
        super(Model1, self).__init__()
        self.cn1 = nn.Conv1d(1, 32, 5, 1)
        self.mp1 = nn.MaxPool1d(4)
        self.cn2 = nn.Conv1d(32, 32, 5, 1)
        self.mp2 = nn.MaxPool1d(4)
        self.cn3 = nn.Conv1d(32, 64, 5, 1)
        self.mp3 = nn.MaxPool1d(4)
        self.cn4 = nn.Conv1d(64, 64, 5, 1)
        self.mp4 = nn.MaxPool1d(2)
        self.cn5 = nn.Conv1d(64, 128, 5, 1)
        self.mp5 = nn.MaxPool1d(2)
        self.cn6 = nn.Conv1d(128, 128, 5, 1)
        self.mp6 = nn.MaxPool1d(2)

        self.fc1 = nn.Linear(31488, 246)  # 128 channels * 246 positions after the conv/pool stack
        self.fc2 = nn.Linear(246, num_classes)

    def forward(self, x):
        out = F.relu(self.cn1(x))
        out = self.mp1(out)
        out = F.relu(self.cn2(out))
        out = self.mp2(out)
        out = F.relu(self.cn3(out))
        out = self.mp3(out)
        out = F.relu(self.cn4(out))
        out = self.mp4(out)
        out = F.relu(self.cn5(out))
        out = self.mp5(out)
        out = F.relu(self.cn6(out))
        out = self.mp6(out)
        out = out.view(out.size(0), -1)  # flatten to [batch, 31488] before the linear layers
        out = self.fc1(out)
        out = self.fc2(out)

        return out

model = Model1()
x = torch.randn(1, 1, 128000)
out = model(x)
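
As a side note (not part of the original answer), one way to find a value like 31488 yourself is to push a dummy input through the conv/pool stack and inspect the shape right before the flatten; the ReLUs are omitted here since they don't change shapes:

import torch
import torch.nn as nn

# same conv/pool stack as above, minus the ReLUs (which don't affect shapes)
convs = nn.Sequential(
    nn.Conv1d(1, 32, 5, 1), nn.MaxPool1d(4),
    nn.Conv1d(32, 32, 5, 1), nn.MaxPool1d(4),
    nn.Conv1d(32, 64, 5, 1), nn.MaxPool1d(4),
    nn.Conv1d(64, 64, 5, 1), nn.MaxPool1d(2),
    nn.Conv1d(64, 128, 5, 1), nn.MaxPool1d(2),
    nn.Conv1d(128, 128, 5, 1), nn.MaxPool1d(2),
)
with torch.no_grad():
    out = convs(torch.randn(1, 1, 128000))
print(out.shape)                # torch.Size([1, 128, 246])
print(out.view(1, -1).size(1))  # 31488 -> use as in_features for fc1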

@ptrblck thank you so much, sir. I think I really need to study shapes and sizes in models :frowning: