Dimension mismatch between data and nn.Linear tensors?

johnmwu · June 13, 2019, 2:36pm

I’m working through this tutorial, but this problem is slightly more general.

In the BoW model there, the input data tensor has shape (1, 26), but the printed weight matrix (tensor) has shape (2, 26). This confuses me, because you cannot multiply a 1x26 matrix by a 2x26 matrix.

Of course, I assume PyTorch does something internally to make everything proper. However, my question is: why does this apparent dimension mismatch exist?

ptrblck · June 13, 2019, 3:45pm

The weight matrix is transposed before the matrix multiplication is applied (line of code), which makes the shapes compatible.
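
As a minimal sketch of what that means, the snippet below compares the output of `nn.Linear` with a manual `x @ W.T + b`. The sizes (26 input features, 2 outputs) come from the question; the random input and the seed are only assumptions for illustration, not the tutorial's data.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

# Sizes from the question: 26 input features, 2 output labels.
linear = nn.Linear(in_features=26, out_features=2)
x = torch.randn(1, 26)  # a single data point (random stand-in for a BoW vector)

# The weight is stored as (out_features, in_features).
print(linear.weight.shape)  # torch.Size([2, 26])
print(linear.bias.shape)    # torch.Size([2])

# What the module computes vs. the same thing written out by hand:
# (1, 26) @ (26, 2) + (2,) -> (1, 2)
out_module = linear(x)
out_manual = x @ linear.weight.t() + linear.bias

print(out_module.shape)  # torch.Size([1, 2])
print(torch.allclose(out_module, out_manual, atol=1e-6))
```

So the stored weight has one row per output feature, and the transpose in the forward pass turns it into the (26, 2) matrix that the (1, 26) input can be multiplied with.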

johnmwu · June 19, 2019, 3:03pm

Thanks for the answer. This clears up why it works, but it does not answer the original question.

Why does this dimension mismatch exist? What is the purpose? Is there some invariant that PyTorch wants to maintain (e.g., that the first axis of any data tensor should index data points)? Why not just print the weight matrix as a 26x2 matrix?

ptrblck · June 19, 2019, 4:37pm

This topic might have the answer.