RuntimeError: size mismatch, m1: [192 x 68], m2: [1024 x 68] at /opt/conda/conda-bld/pytorch_/work/aten/src/THC/generic/THCTensorMathBlas.cu:268

I’m getting a size mismatch error that I can’t understand.

(Pdb) self.W_di
Linear(in_features=68, out_features=1024, bias=True)
(Pdb) indices.size()
torch.Size([32, 6, 68])
(Pdb) self.W_di(indices)
*** RuntimeError: size mismatch, m1: [192 x 68], m2: [1024 x 68] at /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THC/generic/THCTensorMathBlas.cu:268

Why is there a mismatch?
Maybe it's because of the way I define the weight in forward (instead of in __init__)?

This is how I defined self.W_di:

def forward(self, indices):

    if self.W_di is None:
        self.W_di_weight = nn.Parameter(torch.randn(mL_n * 2, 1024).to(device))
        self.W_di_bias = nn.Parameter(torch.ones(1024).to(device))

    self.W_di = nn.Linear(mL_n * 2, 1024)
    self.W_di.weight = self.W_di_weight
    self.W_di.bias = self.W_di_bias

    result = self.W_di(indices)

Any pointers would be highly appreciated!

Hi,

I have tested your code, and the error comes from your weight initialization. nn.Linear stores its weight with shape (out_features, in_features), so you have to transpose the weight you created (or swap the sizes when constructing it).
The conventional way to initialize the weight and bias, though, is to use the torch.nn.init module.

Here is your code using this convention, and it works:


import torch
import torch.nn as nn

w_di = nn.Linear(68, 1024, bias=True).cuda()
indices = torch.randn(32, 6, 68).cuda()
torch.nn.init.normal_(w_di.weight)
torch.nn.init.constant_(w_di.bias, 1.0)
w_di(indices)
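To see where the m1 and m2 shapes in the error message come from, here is a minimal CPU sketch of nn.Linear's weight layout and how a 3-D input is handled:

```python
import torch
import torch.nn as nn

# nn.Linear(in_features, out_features) stores its weight as (out_features, in_features)
layer = nn.Linear(68, 1024)
print(layer.weight.shape)  # torch.Size([1024, 68])

# For a 3-D input, the leading dimensions are flattened into rows for the matmul:
# 32 * 6 = 192, which is where the "m1: [192 x 68]" in the error comes from
x = torch.randn(32, 6, 68)
print(layer(x).shape)  # torch.Size([32, 6, 1024])
```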

And this is your updated code:


import torch
import torch.nn as nn

w_di = nn.Linear(68, 1024, bias=True).cuda()
indices = torch.randn(32, 6, 68).cuda()
w_di.weight = nn.Parameter(torch.randn(34 * 2, 1024).t().cuda())
w_di.bias = nn.Parameter(torch.ones(1024).cuda())
w_di(indices)
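For contrast, a CPU sketch of the same shapes showing that assigning the weight without the .t() call reproduces the original error:

```python
import torch
import torch.nn as nn

w_di = nn.Linear(68, 1024, bias=True)
indices = torch.randn(32, 6, 68)

# Assigning a weight shaped (in_features, out_features), i.e. without transposing,
# makes the forward pass fail with a size-mismatch RuntimeError
w_di.weight = nn.Parameter(torch.randn(34 * 2, 1024))
w_di.bias = nn.Parameter(torch.ones(1024))
try:
    w_di(indices)
except RuntimeError as e:
    print(e)  # prints the size-mismatch message
```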

Bests
Nik
