Conditional VAE Size mismatch

Hello all, I am new to this method. When I try to run it, I get a size mismatch error message.
Could someone help me with the two pieces of code below, please?

main:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import torch as th
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.autograd import Variable as V

train = pd.read_csv("dataset/Train+.csv")
trainx, trainy = np.array(train[train.columns[train.columns != "class"]]), np.array(pd.get_dummies(train["class"]))
batch_size = 512
max_epoch = 100
train_N = len(train)
gpu = False
device = "cuda" if gpu else "cpu"
model = CVAE()
if gpu:
    model = model.cuda()
opt = optim.Adadelta(model.parameters(),lr = 1e-3)

def Loss_function(x_hat, x, mu, logsigma):
    # summed reconstruction term plus KL divergence, averaged per sample later
    reconstruction_loss = F.binary_cross_entropy(x_hat, x, size_average=False)
    KL_div = -0.5 * th.sum(1 + logsigma - mu.pow(2) - logsigma.exp())
    return reconstruction_loss + KL_div
def create_batch(x,y):
    a = list(range(len(x)))
    np.random.shuffle(a)
    x = x[a]
    y = y[a]
    batch_x = [x[batch_size * i : (i+1)*batch_size,:].tolist() for i in range(len(x)//batch_size)]
    batch_y = [y[batch_size * i : (i+1)*batch_size].tolist() for i in range(len(x)//batch_size)]
    return batch_x, batch_y
def train():
    model.train()
    tr_loss = 0
    batch_x,batch_y = create_batch(trainx,trainy)
    for x,y in zip(batch_x,batch_y):
        opt.zero_grad()
        if gpu:
            x,y = V(th.Tensor(x).cuda()),V(th.Tensor(y).cuda())
        else:
            x,y = V(th.Tensor(x)),V(th.Tensor(y))
        x_hat,mu,logsigma = model(x,y)
        loss = Loss_function(x_hat,x,mu,logsigma)
        loss.backward()
        tr_loss += loss.item()
        opt.step()
    return tr_loss/train_N
tr_loss = []
for epoch in range(max_epoch):
    trl = train()
    tr_loss.append(trl)
    if epoch % 1 == 0:
        print(epoch,trl)

th.save(model.state_dict(), f"save_model/vae_adadelta_{max_epoch}.pth")
plt.plot(tr_loss)
plt.show()

model:

class CVAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1=nn.Linear(116,500)
        self.hidden = nn.Linear(512,505)
        self.mu = nn.Linear(505,25)
        self.sigma = nn.Linear(505,25)
        
        self.fc2 = nn.Linear(25,495)
        self.fc3 = nn.Linear(500,116)
        self.sigmoid = nn.Sigmoid()
        self.relu = nn.ReLU()
        
    def encoder(self,x, y):
        h = self.relu(self.fc1(x))
        h1 = th.cat((h, y), dim = 1)
        h2 = self.hidden(self.relu(self.hidden(h1)))
        return self.mu(h2),self.sigma(h2)
    
    def revize_parameter(self,mu,logsigma):

        sigma = th.exp(0.5*logsigma)
        eps = V(th.randn(sigma.size()))
        return sigma.mul(eps) + mu

    
    def decoder(self,z,y):
        h = self.relu(self.fc2(z))

        h = th.cat((h,y),dim = 1)
        h = self.fc3(self.relu(self.hidden(h)))
        return self.sigmoid(h)
    
    def forward(self,x,y):
        mu,sigma = self.encoder(x, y)
        z = self.revize_parameter(mu,sigma)
        output = self.decoder(z,y)
        return output,mu,sigma

error:
builtins.RuntimeError: size mismatch, m1: [512 x 505], m2: [512 x 505]

Hi @Mi_Rak, the error is quite clear.

It's just like matrix multiplication: (m × n) · (n × p) = (m × p), where n has to be the same for both matrices.
My guess:

  1. Either your input shape does not match the structure of your model,

  2. or this is the cause, since if you pass the output of fc1 to hidden, it should look something like this:

self.fc1 = nn.Linear(116,512) # nn.Linear(input_channel,output_channel)
self.hidden = nn.Linear(512,505) # nn.Linear(input_channel,output_channel)
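
For illustration, a minimal sketch (not from the original code) of how nn.Linear raises exactly this kind of error when the input width does not match the layer's in_features:

import torch
import torch.nn as nn

layer = nn.Linear(512, 505)   # expects inputs of width 512
ok = torch.randn(8, 512)
print(layer(ok).shape)        # torch.Size([8, 505])

bad = torch.randn(8, 505)     # width 505 != in_features 512
layer(bad)                    # raises RuntimeError: size mismatch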

Hope this solves the problem.


Hi @Usama_Hasan,
Thank you for your reply. I tried your suggestion, but I got a new mismatch error, and I am not able to find the mistake.
The error
size mismatch, m1: [512 x 505], m2: [512 x 505]
changed to this:
size mismatch, m1: [512 x 517], m2: [512 x 505]

@Mi_Rak
Can you share the whole stack trace of the error?
Also, please share the input data shape.


@Usama_Hasan

Error:
builtins.RuntimeError: size mismatch, m1: [512 x 517], m2: [512 x 505] at c:\a\w\1\s\windows\pytorch\aten\src\th\generic/THTensorMath.cpp:940

Input data:

<bound method DataFrame.info of         duration     src_bytes  ...  service_vmnet  service_whois
0       0.000000  3.558064e-07  ...              0              0
1       0.000000  1.057999e-07  ...              0              0
2       0.000000  0.000000e+00  ...              0              0
3       0.000000  1.681203e-07  ...              0              0
4       0.000000  1.442067e-07  ...              0              0
...          ...           ...  ...            ...            ...
125968  0.000000  0.000000e+00  ...              0              0
125969  0.000186  7.608895e-08  ...              0              0
125970  0.000000  1.616709e-06  ...              0              0
125971  0.000000  0.000000e+00  ...              0              0
125972  0.000000  1.094232e-07  ...              0              0

[125973 rows x 117 columns]>
tensor([[0.0000e+00, 0.0000e+00, 0.0000e+00,  ..., 0.0000e+00, 0.0000e+00,
         0.0000e+00],
        [0.0000e+00, 7.2466e-10, 0.0000e+00,  ..., 0.0000e+00, 0.0000e+00,
         0.0000e+00],
        [1.2958e-01, 1.0652e-07, 8.0157e-08,  ..., 0.0000e+00, 0.0000e+00,
         0.0000e+00],
        ...,
        [0.0000e+00, 3.0436e-08, 3.2063e-08,  ..., 0.0000e+00, 0.0000e+00,
         0.0000e+00],
        [0.0000e+00, 0.0000e+00, 0.0000e+00,  ..., 0.0000e+00, 0.0000e+00,
         0.0000e+00],
        [0.0000e+00, 7.4785e-07, 0.0000e+00,  ..., 0.0000e+00, 0.0000e+00,
         0.0000e+00]])
tensor([[0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.],
        [1., 0., 0., 0., 0.],
        ...,
        [1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 1., 0., 0., 0.]])

Hello again, I guess we need to break your code down further to find the error.
First, please confirm:
The shape of x = (no_samples, 116)?
The shape of y = (no_samples, 1)?
If this is the case, then:

Here in your forward pass you concatenate the fc1 layer output with y, which by my reckoning has width 1; that makes the input to the inner hidden layer 500 + 1.
My suggestion: please break this down like this:

h1 = torch.cat((h, y), dim = 1)
# check the output dim of h1 after the concat
temp_h1 = self.relu(self.hidden(h1))
print(temp_h1.size())  # check the output dim
h2 = self.hidden(temp_h1)
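
As a side note, one way to make such a mismatch fail with a clearer message is to compare the tensor width against the layer's in_features before calling it. A small sketch extending the fragment above (the assert is my addition, not part of the original code):

h1 = torch.cat((h, y), dim = 1)
assert h1.size(1) == self.hidden.in_features, \
    f"hidden expects {self.hidden.in_features} features, got {h1.size(1)}"
temp_h1 = self.relu(self.hidden(h1))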

Dear @Usama_Hasan,
Thank you so much for your help, I really appreciate it. Yes, these are the shapes:
The shape of x = (no_samples, 116)
The shape of y = (no_samples, 1)

Following your suggestion:

def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(116,512)
        self.hidden = nn.Linear(512,517)
        self.mu = nn.Linear(517,25)
        self.sigma = nn.Linear(517,25)

        self.fc2 = nn.Linear(25,495)
        self.fc3 = nn.Linear(500,116)
        self.sigmoid = nn.Sigmoid()
        self.relu = nn.ReLU()

def encoder(self,x, y):
        h = self.relu(self.fc1(x))
        h1 = th.cat((h, y), dim = 1)
        h2 = self.relu(self.hidden(h1))
        h3 = self.hidden(h2)
        return self.mu(h3),self.sigma(h3)

It still returns the same error:
builtins.RuntimeError: size mismatch, m1: [512 x 517], m2: [512 x 517] at c:\a\w\1\s\windows\pytorch\aten\src\th\generic/THTensorMath.cpp:940

Thanks, this makes it easier to explain the problem with this code.
I ran your code with a random dataset of

size = (no_samples=100, 116)

So in the encoder function you first do:

h = self.relu(self.fc1(x))
print(h.size())
# torch.Size([100, 512]) -- see your fc1 layer for this output dim

then

h1 = torch.cat((h, y), dim = 1)
print(h1.size())
# torch.Size([100, 513]) because we concat along dim 1 (512 + 1)

Then, considering this:

self.hidden = nn.Linear(512,517) # will take 512 as input dim

But in the forward pass, our output dim is 513 after torch.cat.

So you need to change your self.hidden input channel to 513. Moving forward in your function, you then call self.hidden a second time on its own 517-dim output.

Now here is the actual problem: that same self.hidden expects a 512-dim input, but you pass it a 517-dim matrix, which causes another size mismatch error.
My advice: thoroughly check the size of your output tensors and create a few more hidden layers that can accept those output tensors.
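
To make that advice concrete, here is a minimal sketch of an encoder with separate hidden layers, so every layer's in_features matches what it actually receives; cond_dim stands for the width of y and is an assumption at this point in the thread:

import torch
import torch.nn as nn

cond_dim = 1  # assumed width of y; adjust once the real width is known

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(116, 512)
        # first hidden layer takes the concatenated width 512 + cond_dim
        self.hidden1 = nn.Linear(512 + cond_dim, 517)
        # second hidden layer takes 517, the output width of hidden1
        self.hidden2 = nn.Linear(517, 517)
        self.mu = nn.Linear(517, 25)
        self.sigma = nn.Linear(517, 25)
        self.relu = nn.ReLU()

    def forward(self, x, y):
        h = self.relu(self.fc1(x))            # (N, 512)
        h1 = torch.cat((h, y), dim=1)         # (N, 512 + cond_dim)
        h2 = self.relu(self.hidden1(h1))      # (N, 517)
        h3 = self.hidden2(h2)                 # (N, 517), widths now line up
        return self.mu(h3), self.sigma(h3)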


Also, I don't think the second self.hidden call is necessary, since you have already used the hidden layer once.
This will solve the problem:

def encoder(self,x, y):
        h = self.relu(self.fc1(x))
        h1 = torch.cat((h, y), dim = 1)
        temp_h1 = self.relu(self.hidden(h1))
        return self.mu(temp_h1),self.sigma(temp_h1)
    

Thank you so much for your advice. I will try to modify it accordingly.


Sure,
Try this, though.


Hello again,

def encoder(self,x, y):
        
        h = self.relu(self.fc1(x))
        print(h.size())
        h1 = th.cat((h, y), dim = 1)
        print(h1.size())
        h2 = self.relu(self.hidden(h1))
        return self.mu(h2),self.sigma(h2)

The print size results:

h = self.relu(self.fc1(x))
print(h.size())
# torch.Size([512, 512])

and

h1 = th.cat((h, y), dim = 1)
print(h1.size())
# torch.Size([512, 517])

So confusing. When I try it like this, there is no problem, but in that case there is no y condition in the encoder network. My objective is to condition both the CVAE encoder and decoder on y.

def encoder(self,x):
        h = self.relu(self.fc1(x))
        h1 = self.relu(self.hidden(h))
        h2 = self.hidden(h1)
        return self.mu(h2),self.sigma(h2)

The difference:

def encoder(self,x, y):
        h = self.relu(self.fc1(x))
        h1 = th.cat((h, y), dim = 1)
        h2 = self.relu(self.hidden(h1))
        return self.mu(h2),self.sigma(h2)

This tells us that y = (no_samples, 5); simple maths, since 512 + 5 = 517.
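
For anyone else puzzled by the 5: pd.get_dummies one-hot encodes the class column, so a column with 5 distinct classes becomes 5 columns. A tiny sketch with hypothetical class names:

import numpy as np
import pandas as pd

labels = pd.Series(["a", "b", "c", "d", "e", "a"])  # 5 distinct classes
y = np.array(pd.get_dummies(labels))
print(y.shape)  # (6, 5): one column per class, hence 512 + 5 = 517 after concat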

You don't have to call self.hidden a second time.
That's my whole point: you need to understand your forward function.
Did you try this? It will condition both your input and target:

Yes, I already tried this, but I get the same error:

def encoder(self,x, y):
        h = self.relu(self.fc1(x))
        h1 = torch.cat((h, y), dim = 1)
        temp_h1 = self.relu(self.hidden(h1))
        return self.mu(temp_h1),self.sigma(temp_h1)

builtins.RuntimeError: size mismatch, m1: [512 x 517], m2: [512 x 517] at c:\a\w\1\s\windows\pytorch\aten\src\th\generic/THTensorMath.cpp:940

My misunderstanding was that I thought the shape was y = (no_samples, 1), but it actually is (no_samples, 5).

Change this in your code.

 def __init__(self):
        super().__init__()
        self.fc1=nn.Linear(116,500)
        self.hidden = nn.Linear(505,500)
        self.mu = nn.Linear(500,25)
        self.sigma = nn.Linear(500,25)
        
        self.fc2 = nn.Linear(25,495)
        self.fc3 = nn.Linear(500,116)
        self.sigmoid = nn.Sigmoid()
        self.relu = nn.ReLU()

The old self.hidden input dimension is what caused the earlier error.
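
For completeness, here is a sketch of the whole model with consistent dimensions, assuming y has 5 columns as established above. The decoder gets its own hidden layer (hidden_dec is my name, not from the thread), since reusing the encoder's self.hidden there would repeat the mismatch; reparameterize and randn_like are also my substitutions:

import torch as th
import torch.nn as nn

class CVAE(nn.Module):
    def __init__(self):
        super().__init__()
        # encoder: 116 features, conditioned on 5 one-hot classes
        self.fc1 = nn.Linear(116, 500)
        self.hidden = nn.Linear(505, 500)      # 500 + 5 after the concat
        self.mu = nn.Linear(500, 25)
        self.sigma = nn.Linear(500, 25)
        # decoder: separate hidden layer sized for its own concat
        self.fc2 = nn.Linear(25, 495)
        self.hidden_dec = nn.Linear(500, 500)  # 495 + 5 after the concat
        self.fc3 = nn.Linear(500, 116)
        self.sigmoid = nn.Sigmoid()
        self.relu = nn.ReLU()

    def encoder(self, x, y):
        h = self.relu(self.fc1(x))             # (N, 500)
        h1 = th.cat((h, y), dim=1)             # (N, 505)
        h2 = self.relu(self.hidden(h1))        # (N, 500)
        return self.mu(h2), self.sigma(h2)

    def reparameterize(self, mu, logsigma):
        sigma = th.exp(0.5 * logsigma)
        eps = th.randn_like(sigma)
        return sigma * eps + mu

    def decoder(self, z, y):
        h = self.relu(self.fc2(z))             # (N, 495)
        h = th.cat((h, y), dim=1)              # (N, 500)
        h = self.fc3(self.relu(self.hidden_dec(h)))
        return self.sigmoid(h)

    def forward(self, x, y):
        mu, logsigma = self.encoder(x, y)
        z = self.reparameterize(mu, logsigma)
        return self.decoder(z, y), mu, logsigma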
