Matrix Multiplication

How do I do matrix multiplication on two tensors of sizes
torch.Size([32, 32, 3, 3]) and torch.Size([32, 64, 1, 1]) to get a tensor of size
torch.Size([32, 64, 3, 3])?

You can use einsum: torch.einsum — PyTorch 1.12 documentation. It is a very cool notation that, if understood, can let you do operations really efficiently.

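import torch

# Random placeholders with the shapes from the question
A = torch.randn(32, 32, 3, 3)
B = torch.randn(32, 64, 1, 1)

# Multiply along j, then sum over j, n and o
C = torch.einsum("ijkl,jmno->imkl", A, B)  # C.shape is torch.Size([32, 64, 3, 3])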
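
If you want to convince yourself the result is what you expect, here is a rough sanity check. It assumes random inputs, and the names B2 and C_ref are just mine for illustration: it drops the two size-1 dimensions of B and does an ordinary matrix product over j, which should match the einsum output up to floating-point error.

```python
import torch

A = torch.randn(32, 32, 3, 3)
B = torch.randn(32, 64, 1, 1)

C = torch.einsum("ijkl,jmno->imkl", A, B)
print(C.shape)  # torch.Size([32, 64, 3, 3])

# Reference computation without einsum:
# drop the two trailing size-1 dimensions of B to get a (j, m) matrix
B2 = B.squeeze(-1).squeeze(-1)  # shape (32, 64)

# Move j to the last axis of A, multiply by the (j, m) matrix,
# then move m back to the second axis: (i, k, l, m) -> (i, m, k, l)
C_ref = (A.permute(0, 2, 3, 1) @ B2).permute(0, 3, 1, 2)

print(torch.allclose(C, C_ref, atol=1e-4))
```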

While the multiplication is indeed “possible” in terms of transpositions/broadcasting, I think we could use a bit more context to see if what the poster is trying to do is correct. To me it doesn’t look like those two tensors “should” be multiplied (i.e. what does the operation itself represent?).

Thanks a lot. It worked. What’s the logic behind this?

In simple terms, you name each dimension of the tensors with a letter.

The einsum notation consists of two parts:

  1. the first part, before “->”, in which you specify the dimensions of each input tensor, separated by commas
  2. the second part, after “->”, which specifies the dimensions of the output and thereby the operation.

For the tensors you have, assign the same letter to the dimensions that you want to multiply together, and leave out of the output the dimensions along which you want to accumulate (sum).

E.g.
Element wise multiplication between two matrices: “ij,ij->ij”

Matrix multiplication:
“mn,np->mp” (multiply rows by columns along n, then accumulate over n)
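
Here is a small sketch of those two patterns, checked against the usual operators. The tensor names and sizes (X, Y, W, 4x5, 5x6) are made up for illustration.

```python
import torch

X = torch.randn(4, 5)
Y = torch.randn(4, 5)
W = torch.randn(5, 6)

# "ij,ij->ij": both letters are kept in the output, so nothing is summed
elementwise = torch.einsum("ij,ij->ij", X, Y)
print(torch.allclose(elementwise, X * Y))

# "mn,np->mp": n is shared by both inputs and missing from the output,
# so the products are summed over n
matmul = torch.einsum("mn,np->mp", X, W)
print(torch.allclose(matmul, X @ W, atol=1e-5))
```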

In your example I multiplied along dimension j and accumulated over j, n and o. Since n and o have size 1, you could reduce the number of letters and multiply along those dimensions instead of accumulating over them; this should be less efficient, though.
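
To make the "fewer letters" variant concrete, here is a sketch comparing three spellings that should give the same result. The second one relies on torch.einsum allowing dimensions that share a letter to have size 1 in one operand (they are broadcast), as I understand the documentation. The third simply removes the size-1 dimensions first. The variable names are mine, and I have not benchmarked any of these.

```python
import torch

A = torch.randn(32, 32, 3, 3)
B = torch.randn(32, 64, 1, 1)

# Original: n and o get their own letters and are summed out of B
C_sum = torch.einsum("ijkl,jmno->imkl", A, B)

# Fewer letters: B reuses k and l; its size-1 dims are broadcast against A's
C_shared = torch.einsum("ijkl,jmkl->imkl", A, B)

# Alternative: drop the size-1 dims of B up front
C_squeezed = torch.einsum("ijkl,jm->imkl", A, B.squeeze(-1).squeeze(-1))

print(torch.allclose(C_sum, C_shared, atol=1e-4))
print(torch.allclose(C_sum, C_squeezed, atol=1e-4))
```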

In general you can also transpose the dimensions and do other fancy stuff with it.
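
A few examples of that "other fancy stuff", as a sketch with made-up tensors (M, P, Q are just illustrative names): transposing by reordering the output letters, summing everything by leaving the output empty, and a batched matrix product.

```python
import torch

M = torch.randn(3, 4)

# Transpose: same letters, different order in the output
Mt = torch.einsum("ij->ji", M)
print(torch.equal(Mt, M.T))

# Sum of all elements: no letters in the output, so both are accumulated
total = torch.einsum("ij->", M)
print(torch.allclose(total, M.sum(), atol=1e-5))

# Batched matmul: b is kept, j is shared and summed
P = torch.randn(2, 3, 4)
Q = torch.randn(2, 4, 5)
PQ = torch.einsum("bij,bjk->bik", P, Q)
print(torch.allclose(PQ, torch.bmm(P, Q), atol=1e-5))
```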

You can find more information online.

Hope it was enough for an introduction.

Thanks for such a detailed explanation.