I’m working through this tutorial, but my question is slightly more general.

In the BoW model there, the input data tensor has shape (1, 26), but the printed weight matrix (tensor) has shape (2, 26). This confuses me, because you cannot multiply a 1x26 matrix by a 2x26 matrix: the inner dimensions (26 and 2) do not match.
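To make the shapes concrete, here is a minimal sketch of what I am seeing. It assumes the model's layer is an `nn.Linear(26, 2)` (the sizes are inferred from the shapes above, not copied from the tutorial), and `x` is just a zero-filled stand-in for one bag-of-words vector.

```python
import torch
import torch.nn as nn

# Assumed sizes, inferred from the shapes quoted above: 26 input features, 2 outputs.
layer = nn.Linear(26, 2)
x = torch.zeros(1, 26)  # stand-in for a single bag-of-words vector

print(x.shape)             # torch.Size([1, 26])
print(layer.weight.shape)  # torch.Size([2, 26])
print(layer(x).shape)      # torch.Size([1, 2])

# The literal product x @ W is rejected because the inner dimensions differ (26 vs 2).
try:
    x @ layer.weight
    naive_product_ok = True
except RuntimeError:
    naive_product_ok = False
print(naive_product_ok)    # False
```

So the forward pass works, even though the two printed shapes cannot be multiplied as they stand.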

Of course, I assume PyTorch does something internally to make the shapes line up. However, my question is: why does this apparent dimension mismatch exist in the first place?

Thanks for the answer. This clears up why it works, but it does not answer the original question.
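For anyone reading later, this is my understanding of why it works, as a rough sketch rather than a description of PyTorch's internals: `nn.Linear` stores its weight as (out_features, in_features) and applies it transposed, so the layer output matches `x @ W.T + b`. The sizes 26 and 2 are again assumed from the shapes in my question.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)      # only so the random example is reproducible
layer = nn.Linear(26, 2)  # assumed sizes: 26 input features, 2 outputs
x = torch.rand(1, 26)     # stand-in for one input row

# (1, 26) @ (26, 2) + (2,) -> (1, 2)
manual = x @ layer.weight.T + layer.bias

# Compare with the layer's own output, allowing for floating-point rounding.
matches = torch.allclose(layer(x), manual, atol=1e-6)
print(matches)
print(layer.weight[0].shape)  # one row of the weight: the 26 weights feeding output 0
```

With this layout, each row of the stored weight holds the weights for one output unit.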

Why does this dimension mismatch exist? What is its purpose? Is there some invariant that PyTorch wants to maintain (e.g., that the first axis of any data tensor should index data points)? Why not just store and print the weight matrix as a 26x2 matrix?