Cannot matmul with 3D tensor `2D @ 3D`

I expected matmul of a [3, 3] shaped tensor with a [3, 16, 1080] shaped tensor to yield a [3, 16, 1080] shaped tensor.
However, I get a runtime error.

Is this the wrong implementation? Or am I missing something?

import numpy as np
import torch

X = torch.from_numpy(np.random.randn(3,3)).cuda().float()
Y = torch.from_numpy(np.random.randn(3,16,1080)).cuda().float()
print(X.shape, Y.shape)
print(X.dtype, Y.dtype)
X @ Y
torch.Size([3, 3]) torch.Size([3, 16, 1080])
torch.float32 torch.float32
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-14-c2c71b1687e7> in <module>
      6 print(X.shape, Y.shape)
      7 print(X.dtype, Y.dtype)
----> 8 X @ Y

RuntimeError: mat1 and mat2 shapes cannot be multiplied (3240x16 and 3x3)

Hi tsp!

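torch.matmul() (@) doesn’t perform the matrix-tensor multiplication you
might expect. Instead, for these tensors, it tries to perform “batch matrix
multiplication” (broadcasting the batch dimensions): it treats Y as a batch
of 3 matrices of shape [16, 1080] and tries to multiply X, of shape [3, 3],
with each of them, so the inner dimensions (3 and 16) don’t line up.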
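To make the batch interpretation concrete, here is a minimal sketch (run on the CPU; the tensor Z is a hypothetical example added for contrast and is not part of your code). It shows that X @ Y fails, while a batch of matrices that each have 3 rows multiplies without complaint:

```python
import torch

X = torch.randn(3, 3)
Y = torch.randn(3, 16, 1080)

# matmul sees Y as a batch of 3 matrices, each of shape [16, 1080],
# and broadcasts X over that batch. The inner dimensions (3 vs. 16)
# do not match, so this raises a RuntimeError.
try:
    X @ Y
    failed = False
except RuntimeError as err:
    failed = True
    print(err)

# Hypothetical contrast case: a batch of 16 matrices of shape [3, 1080].
# Here the inner dimensions match (3 and 3), so X is broadcast over
# the batch dimension and the product is computed per batch element.
Z = torch.randn(16, 3, 1080)
batched = X @ Z
print(batched.shape)
```

Note that in the working case the batch dimension comes first in the result, which is not the [3, 16, 1080] layout you are after.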

Probably the simplest approach is to use einsum():

>>> import torch
>>> print (torch.__version__)
1.12.0
>>>
>>> _ = torch.manual_seed (2022)
>>>
>>> X = torch.randn (3, 3)
>>> Y = torch.randn (3, 16, 1080)
>>>
>>> prod = torch.einsum ('ij,jkl->ikl', X, Y)
>>>
>>> prod.shape
torch.Size([3, 16, 1080])
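
If you would rather not use einsum(), the same contraction can probably be written in a couple of other ways. The sketch below is only an illustration (the variable names are mine): it compares the einsum() result with torch.tensordot() and with flattening Y's trailing dimensions so that an ordinary 2D matmul applies.

```python
import torch

torch.manual_seed(2022)

X = torch.randn(3, 3)
Y = torch.randn(3, 16, 1080)

# Reference result: contract X's second index with Y's first index.
prod_einsum = torch.einsum('ij,jkl->ikl', X, Y)

# tensordot: contract dimension 1 of X with dimension 0 of Y.
prod_tensordot = torch.tensordot(X, Y, dims=([1], [0]))

# Flatten Y's trailing dimensions to get a [3, 16 * 1080] matrix,
# use an ordinary 2D matmul, then restore the original shape.
prod_reshape = (X @ Y.reshape(3, -1)).reshape(3, 16, 1080)

# All three should agree up to floating-point round-off.
same_tensordot = torch.allclose(prod_einsum, prod_tensordot, atol=1e-5)
same_reshape = torch.allclose(prod_einsum, prod_reshape, atol=1e-5)
print(prod_tensordot.shape, prod_reshape.shape)
print(same_tensordot, same_reshape)
```

Which form reads best is a matter of taste; einsum() states the index pattern most explicitly.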

Best.

K. Frank
