Conditional VAE Size mismatch

Hello all, I am new to this method. When I try to run it, I get a size mismatch error message.
Could someone help me with the two pieces of code below, please?

main:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import torch as th
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.autograd import Variable as V

train = pd.read_csv("dataset/Train+.csv")
trainx, trainy = np.array(train[train.columns[train.columns != "class"]]), np.array(pd.get_dummies(train["class"]))
batch_size = 512
max_epoch = 100
train_N = len(train)
gpu = False
device = "cuda" if gpu else "cpu"
model = CVAE()
if gpu:
    model = model.cuda()
opt = optim.Adadelta(model.parameters(),lr = 1e-3)

def Loss_function(x_hat, x, mu, logsigma):
    # summed reconstruction term plus KL divergence, averaged per sample later
    reconstruction_loss = F.binary_cross_entropy(x_hat, x, size_average=False)
    KL_div = -0.5 * th.sum(1 + logsigma - mu.pow(2) - logsigma.exp())
    return reconstruction_loss + KL_div
def create_batch(x,y):
    a = list(range(len(x)))
    np.random.shuffle(a)
    x = x[a]
    y = y[a]
    batch_x = [x[batch_size * i : (i+1)*batch_size,:].tolist() for i in range(len(x)//batch_size)]
    batch_y = [y[batch_size * i : (i+1)*batch_size].tolist() for i in range(len(x)//batch_size)]
    return batch_x, batch_y
def train():
    model.train()
    tr_loss = 0
    batch_x,batch_y = create_batch(trainx,trainy)
    for x,y in zip(batch_x,batch_y):
        opt.zero_grad()
        if gpu:
            x,y = V(th.Tensor(x).cuda()),V(th.Tensor(y).cuda())
        else:
            x,y = V(th.Tensor(x)),V(th.Tensor(y))
        x_hat,mu,logsigma = model(x,y)
        loss = Loss_function(x_hat,x,mu,logsigma)
        loss.backward()
        tr_loss += loss.item()
        opt.step()
    return tr_loss/train_N
tr_loss = []
for epoch in range(max_epoch):
    trl = train()
    tr_loss.append(trl)
    if epoch % 1 == 0:
        print(epoch,trl)

th.save(model.state_dict(), f"save_model/vae_adadelta_{max_epoch}.pth")
plt.plot(tr_loss)
plt.show()

model:

class CVAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1=nn.Linear(116,500)
        self.hidden = nn.Linear(512,505)
        self.mu = nn.Linear(505,25)
        self.sigma = nn.Linear(505,25)
        
        self.fc2 = nn.Linear(25,495)
        self.fc3 = nn.Linear(500,116)
        self.sigmoid = nn.Sigmoid()
        self.relu = nn.ReLU()
        
    def encoder(self,x, y):
        h = self.relu(self.fc1(x))
        h1 = th.cat((h, y), dim = 1)
        h2 = self.hidden(self.relu(self.hidden(h1)))
        return self.mu(h2),self.sigma(h2)
    
    def revize_parameter(self,mu,logsigma):

        sigma = th.exp(0.5*logsigma)
        eps = V(th.randn(sigma.size()))
        return sigma.mul(eps) + mu

    
    def decoder(self,z,y):
        h = self.relu(self.fc2(z))

        h = th.cat((h,y),dim = 1)
        h = self.fc3(self.relu(self.hidden(h)))
        return self.sigmoid(h)
    
    def forward(self,x,y):
        mu,sigma = self.encoder(x, y)
        z = self.revize_parameter(mu,sigma)
        output = self.decoder(z,y)
        return output,mu,sigma

error:
builtins.RuntimeError: size mismatch, m1: [512 x 505], m2: [512 x 505]

Hi @Mi_Rak, the error is quite clear.

It's just like matrix multiplication: (m × n) · (n × p) = (m × p), where n has to be the same for both matrices.
My guess:

  1. Either your input shape does not match the structure of your model,

  2. or this is the cause, since if you pass the output of fc1 to hidden, it should look something like this:

self.fc1 = nn.Linear(116,512) # nn.Linear(input_channel,output_channel)
self.hidden = nn.Linear(512,505) # nn.Linear(input_channel,output_channel)
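
For illustration, a minimal sketch (not from the original code) of how nn.Linear raises exactly this kind of error when the input width does not match the layer's in_features:

import torch
import torch.nn as nn

layer = nn.Linear(512, 505)   # expects inputs of width 512
ok = torch.randn(8, 512)
print(layer(ok).shape)        # torch.Size([8, 505])

bad = torch.randn(8, 505)     # width 505 != in_features 512
layer(bad)                    # raises RuntimeError: size mismatch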

Hope this solves the problem.


Hi @Usama_Hasan,
Thank you for your reply. I tried your suggestion, but I got a new mismatch error, and I am not able to find the mistake.
The error
size mismatch, m1: [512 x 505], m2: [512 x 505]
changed to this:
size mismatch, m1: [512 x 517], m2: [512 x 505]

@Mi_Rak
Can you share the whole stack trace of the error?
Also, please share the input data shape.


@Usama_Hasan

Error:
builtins.RuntimeError: size mismatch, m1: [512 x 517], m2: [512 x 505] at c:\a\w\1\s\windows\pytorch\aten\src\th\generic/THTensorMath.cpp:940

Input data:

<bound method DataFrame.info of         duration     src_bytes  ...  service_vmnet  service_whois
0       0.000000  3.558064e-07  ...              0              0
1       0.000000  1.057999e-07  ...              0              0
2       0.000000  0.000000e+00  ...              0              0
3       0.000000  1.681203e-07  ...              0              0
4       0.000000  1.442067e-07  ...              0              0
...          ...           ...  ...            ...            ...
125968  0.000000  0.000000e+00  ...              0              0
125969  0.000186  7.608895e-08  ...              0              0
125970  0.000000  1.616709e-06  ...              0              0
125971  0.000000  0.000000e+00  ...              0              0
125972  0.000000  1.094232e-07  ...              0              0

[125973 rows x 117 columns]>
tensor([[0.0000e+00, 0.0000e+00, 0.0000e+00,  ..., 0.0000e+00, 0.0000e+00,
         0.0000e+00],
        [0.0000e+00, 7.2466e-10, 0.0000e+00,  ..., 0.0000e+00, 0.0000e+00,
         0.0000e+00],
        [1.2958e-01, 1.0652e-07, 8.0157e-08,  ..., 0.0000e+00, 0.0000e+00,
         0.0000e+00],
        ...,
        [0.0000e+00, 3.0436e-08, 3.2063e-08,  ..., 0.0000e+00, 0.0000e+00,
         0.0000e+00],
        [0.0000e+00, 0.0000e+00, 0.0000e+00,  ..., 0.0000e+00, 0.0000e+00,
         0.0000e+00],
        [0.0000e+00, 7.4785e-07, 0.0000e+00,  ..., 0.0000e+00, 0.0000e+00,
         0.0000e+00]])
tensor([[0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.],
        [1., 0., 0., 0., 0.],
        ...,
        [1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 1., 0., 0., 0.]])

Hello again, I guess we need to break your code down further to find the error.
First, please confirm:
The shape of x = (no_samples, 116)?
The shape of y = (no_samples, 1)?
If this is the case, then:

Here in your forward pass you concatenate the fc1 layer output with y, which by my reckoning has width 1; that makes the input to the inner hidden layer 500 + 1.
My suggestion: please break this down like this:

h1 = torch.cat((h, y), dim = 1)
# check the output dim of h1 after the concat
temp_h1 = self.relu(self.hidden(h1))
print(temp_h1.size())  # check the output dim
h2 = self.hidden(temp_h1)
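
As a side note, one way to make such a mismatch fail with a clearer message is to compare the tensor width against the layer's in_features before calling it. A small sketch extending the fragment above (the assert is my addition, not part of the original code):

h1 = torch.cat((h, y), dim = 1)
assert h1.size(1) == self.hidden.in_features, \
    f"hidden expects {self.hidden.in_features} features, got {h1.size(1)}"
temp_h1 = self.relu(self.hidden(h1))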

Dear @Usama_Hasan,
Thank you so much for your help, I really appreciate it. Yes, these are the shapes:
The shape of x = (no_samples, 116)
The shape of y = (no_samples, 1)

Following your suggestion:

def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(116,512)
        self.hidden = nn.Linear(512,517)
        self.mu = nn.Linear(517,25)
        self.sigma = nn.Linear(517,25)

        self.fc2 = nn.Linear(25,495)
        self.fc3 = nn.Linear(500,116)
        self.sigmoid = nn.Sigmoid()
        self.relu = nn.ReLU()

def encoder(self,x, y):
        h = self.relu(self.fc1(x))
        h1 = th.cat((h, y), dim = 1)
        h2 = self.relu(self.hidden(h1))
        h3 = self.hidden(h2)
        return self.mu(h3),self.sigma(h3)

It still returns the same error:
builtins.RuntimeError: size mismatch, m1: [512 x 517], m2: [512 x 517] at c:\a\w\1\s\windows\pytorch\aten\src\th\generic/THTensorMath.cpp:940

Thanks, this makes it easier to explain the problem with this code.
I ran your code with a random dataset of

size = (no_samples=100, 116)

So in the encoder function you first do:

h = self.relu(self.fc1(x))
print(h.size())
# torch.Size([100, 512]) -- see your fc1 layer for this output dim

then

h1 = torch.cat((h, y), dim = 1)
print(h1.size())
# torch.Size([100, 513]) because we concat along dim 1 (512 + 1)

Then, considering this:

self.hidden = nn.Linear(512,517) # will take 512 as input dim

But in the forward pass, our output dim is 513 after torch.cat.

So you need to change your self.hidden input channel to 513. Moving forward in your function, you then call self.hidden a second time on its own 517-dim output.

Now here is the actual problem: that same self.hidden expects a 512-dim input, but you pass it a 517-dim matrix, which causes another size mismatch error.
My advice: thoroughly check the size of your output tensors and create a few more hidden layers that can accept those output tensors.
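
To make that advice concrete, here is a minimal sketch of an encoder with separate hidden layers, so every layer's in_features matches what it actually receives; cond_dim stands for the width of y and is an assumption at this point in the thread:

import torch
import torch.nn as nn

cond_dim = 1  # assumed width of y; adjust once the real width is known

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(116, 512)
        # first hidden layer takes the concatenated width 512 + cond_dim
        self.hidden1 = nn.Linear(512 + cond_dim, 517)
        # second hidden layer takes 517, the output width of hidden1
        self.hidden2 = nn.Linear(517, 517)
        self.mu = nn.Linear(517, 25)
        self.sigma = nn.Linear(517, 25)
        self.relu = nn.ReLU()

    def forward(self, x, y):
        h = self.relu(self.fc1(x))            # (N, 512)
        h1 = torch.cat((h, y), dim=1)         # (N, 512 + cond_dim)
        h2 = self.relu(self.hidden1(h1))      # (N, 517)
        h3 = self.hidden2(h2)                 # (N, 517), widths now line up
        return self.mu(h3), self.sigma(h3)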


Also, I don't think the second self.hidden call is necessary, since you have already used the hidden layer once.
This will solve the problem:

def encoder(self,x, y):
        h = self.relu(self.fc1(x))
        h1 = torch.cat((h, y), dim = 1)
        temp_h1 = self.relu(self.hidden(h1))
        return self.mu(temp_h1),self.sigma(temp_h1)
    

Thank you so much for your advice. I will try to modify it accordingly.


Sure,
Try this, though.


Hello again,

def encoder(self,x, y):
        
        h = self.relu(self.fc1(x))
        print(h.size())
        h1 = th.cat((h, y), dim = 1)
        print(h1.size())
        h2 = self.relu(self.hidden(h1))
        return self.mu(h2),self.sigma(h2)

The print size results:

h = self.relu(self.fc1(x))
print(h.size())
# torch.Size([512, 512])

and

h1 = th.cat((h, y), dim = 1)
print(h1.size())
# torch.Size([512, 517])

So confusing. When I try it like this, there is no problem, but in that case there is no y condition in the encoder network. My objective is to condition both the CVAE encoder and decoder on y.

def encoder(self,x):
        h = self.relu(self.fc1(x))
        h1 = self.relu(self.hidden(h))
        h2 = self.hidden(h1)
        return self.mu(h2),self.sigma(h2)

The difference:

def encoder(self,x, y):
        h = self.relu(self.fc1(x))
        h1 = th.cat((h, y), dim = 1)
        h2 = self.relu(self.hidden(h1))
        return self.mu(h2),self.sigma(h2)

This tells us that y = (no_samples, 5); simple maths, since 512 + 5 = 517.
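
For anyone else puzzled by the 5: pd.get_dummies one-hot encodes the class column, so a column with 5 distinct classes becomes 5 columns. A tiny sketch with hypothetical class names:

import numpy as np
import pandas as pd

labels = pd.Series(["a", "b", "c", "d", "e", "a"])  # 5 distinct classes
y = np.array(pd.get_dummies(labels))
print(y.shape)  # (6, 5): one column per class, hence 512 + 5 = 517 after concat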

You don't have to call self.hidden a second time.
That's my whole point: you need to understand your forward function.
Did you try this? It will condition both your input and target:

Yes, I already tried this, but I get the same error:

def encoder(self,x, y):
        h = self.relu(self.fc1(x))
        h1 = torch.cat((h, y), dim = 1)
        temp_h1 = self.relu(self.hidden(h1))
        return self.mu(temp_h1),self.sigma(temp_h1)

builtins.RuntimeError: size mismatch, m1: [512 x 517], m2: [512 x 517] at c:\a\w\1\s\windows\pytorch\aten\src\th\generic/THTensorMath.cpp:940

My misunderstanding was that I thought the shape was y = (no_samples, 1), but it actually is (no_samples, 5).

Change this in your code.

 def __init__(self):
        super().__init__()
        self.fc1=nn.Linear(116,500)
        self.hidden = nn.Linear(505,500)
        self.mu = nn.Linear(500,25)
        self.sigma = nn.Linear(500,25)
        
        self.fc2 = nn.Linear(25,495)
        self.fc3 = nn.Linear(500,116)
        self.sigmoid = nn.Sigmoid()
        self.relu = nn.ReLU()

The old self.hidden input dimension is what caused the earlier error.
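
For completeness, here is a sketch of the whole model with consistent dimensions, assuming y has 5 columns as established above. The decoder gets its own hidden layer (hidden_dec is my name, not from the thread), since reusing the encoder's self.hidden there would repeat the mismatch; reparameterize and randn_like are also my substitutions:

import torch as th
import torch.nn as nn

class CVAE(nn.Module):
    def __init__(self):
        super().__init__()
        # encoder: 116 features, conditioned on 5 one-hot classes
        self.fc1 = nn.Linear(116, 500)
        self.hidden = nn.Linear(505, 500)      # 500 + 5 after the concat
        self.mu = nn.Linear(500, 25)
        self.sigma = nn.Linear(500, 25)
        # decoder: separate hidden layer sized for its own concat
        self.fc2 = nn.Linear(25, 495)
        self.hidden_dec = nn.Linear(500, 500)  # 495 + 5 after the concat
        self.fc3 = nn.Linear(500, 116)
        self.sigmoid = nn.Sigmoid()
        self.relu = nn.ReLU()

    def encoder(self, x, y):
        h = self.relu(self.fc1(x))             # (N, 500)
        h1 = th.cat((h, y), dim=1)             # (N, 505)
        h2 = self.relu(self.hidden(h1))        # (N, 500)
        return self.mu(h2), self.sigma(h2)

    def reparameterize(self, mu, logsigma):
        sigma = th.exp(0.5 * logsigma)
        eps = th.randn_like(sigma)
        return sigma * eps + mu

    def decoder(self, z, y):
        h = self.relu(self.fc2(z))             # (N, 495)
        h = th.cat((h, y), dim=1)              # (N, 500)
        h = self.fc3(self.relu(self.hidden_dec(h)))
        return self.sigmoid(h)

    def forward(self, x, y):
        mu, logsigma = self.encoder(x, y)
        z = self.reparameterize(mu, logsigma)
        return self.decoder(z, y), mu, logsigma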
