When you apply a Linear to a tensor, you are not exactly (left)
multiplying the Linear's weight matrix onto the input tensor.
Rather, you are right-multiplying the input tensor by the transpose
of the weight matrix, so that the matrix-multiplication dimensions
match up properly.
Here is an illustrative script:
import torch

lin = torch.nn.Linear(3, 5, bias=False)
inp = torch.randn(2, 3)  # Variable is deprecated; plain tensors carry autograd
inp.matmul(lin.weight.transpose(0, 1))  # shape (2, 5)
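To see that the two formulations agree, you can compare the output of the Linear itself against the manual right-multiplication (a minimal check; the tensor names are just illustrative):

```python
import torch

lin = torch.nn.Linear(3, 5, bias=False)
inp = torch.randn(2, 3)

# The layer computes inp @ weight.T internally, so both paths
# produce a (2, 5) result with the same values.
out_layer = lin(inp)
out_manual = inp.matmul(lin.weight.transpose(0, 1))

print(out_layer.shape)                        # torch.Size([2, 5])
print(torch.allclose(out_layer, out_manual))  # True
```

Note that with `bias=True` (the default), the layer would also add its bias vector, so the manual version would need `+ lin.bias` to match.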