How to batch multiply a 3D matrix?

I am looking to batch multiply two matrices of dimensions torch.Size([7, 265, 768]).

When I tried torch.bmm(x, y), I got this error:

RuntimeError: invalid argument 2: wrong matrix size, batch1: 265x768, batch2: 265x768 at /Users/soumith/code/builder/wheel/pytorch-src/aten/src/TH/generic/THTensorMath.cpp:2312

Any idea what might be causing this?

It might be because you have to multiply a 265x768 matrix by a 768xN matrix to get a 265xN matrix.

Actually, both matrices have the same dimensions. I checked their sizes by printing them.

torch.Size([7, 265, 768])
torch.Size([7, 265, 768])

That’s the problem: you cannot multiply those matrices as they are.

To multiply two matrices, the number of columns of the first must equal the number of rows of the second:
A: NxM
B: MxS

then A*B --> NxS
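
The rule above can be written as a small shape check. This is only an illustrative sketch in plain Python; `matmul_shape` is a hypothetical helper made up for this post, not a PyTorch function.

```python
def matmul_shape(a_shape, b_shape):
    """Return the shape of A @ B for two 2D shapes, or raise ValueError.

    Hypothetical helper for illustration only; not part of PyTorch.
    """
    n, m = a_shape
    m2, s = b_shape
    if m != m2:
        # Columns of A must equal rows of B
        raise ValueError(f"inner dimensions differ: {m} vs {m2}")
    return (n, s)


# Compatible shapes: (N, M) times (M, S) gives (N, S)
print(matmul_shape((265, 768), (768, 500)))

# Two matrices of identical shape 265x768 are rejected
try:
    matmul_shape((265, 768), (265, 768))
except ValueError as err:
    print("cannot multiply:", err)
```

With the shapes from the question, the inner dimensions are 768 and 265, which is why the multiplication is rejected.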

Thanks @JuanFMontesinos. I thought it would work the same way as in maths. So even for 2D matrices of the same dimensions, does one of them have to be reshaped so that they form the structure

A: NxM
B: MxS
then A*B --> NxS

or am I understanding it wrong?

I’m sorry, but the way I described is how it works in math. The only change is that you are adding a 3rd dimension corresponding to the batch.

import torch

a = torch.rand(7, 265, 768)
b = torch.rand(7, 768, 500)  # second dimension (768) matches a's last dimension

c = torch.bmm(a, b)
c.size()
# torch.Size([7, 265, 500])
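
To make the role of the batch dimension concrete, here is a rough sketch that compares torch.bmm with an explicit loop over the batch. The small sizes (7, 4, 6) and (7, 6, 3) are stand-ins chosen for this example, not the sizes from the question.

```python
import torch

torch.manual_seed(0)

# Small stand-in sizes (assumed for illustration): batch of 7 matrices
a = torch.rand(7, 4, 6)
b = torch.rand(7, 6, 3)

# Batched product: one 4x6 @ 6x3 multiplication per batch element
batched = torch.bmm(a, b)

# Same computation written as a loop over the batch dimension
looped = torch.stack([a[i] @ b[i] for i in range(a.size(0))])

print(batched.shape)
print(torch.allclose(batched, looped, atol=1e-6))
```

Each batch element is an ordinary 2D matrix product, so the NxM times MxS rule applies to the last two dimensions only.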

You need to take the transpose of the second matrix to make the dimensions match.

import torch

a = torch.rand(7, 265, 768)
b = torch.rand(7, 265, 768)

# Swap the last two dimensions of b: (7, 265, 768) -> (7, 768, 265)
c = torch.matmul(a, b.transpose(-2, -1))
c.size()
# torch.Size([7, 265, 265])
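
As a cross-check, the same product can probably be written a few equivalent ways. The sketch below assumes small stand-in sizes (7, 5, 8) rather than the sizes from the question, and compares torch.matmul, torch.bmm, and torch.einsum.

```python
import torch

torch.manual_seed(0)

# Stand-in sizes (assumed for illustration); both tensors have the same shape
a = torch.rand(7, 5, 8)
b = torch.rand(7, 5, 8)

# Transpose the last two dimensions of b so the inner dimensions match
c_matmul = torch.matmul(a, b.transpose(-2, -1))

# torch.bmm works too once b is transposed to (7, 8, 5)
c_bmm = torch.bmm(a, b.transpose(1, 2))

# einsum spells out the contraction over the shared last dimension k
c_einsum = torch.einsum("bik,bjk->bij", a, b)

print(c_matmul.shape)
print(torch.allclose(c_matmul, c_bmm, atol=1e-6))
print(torch.allclose(c_matmul, c_einsum, atol=1e-6))
```

Note that transpose returns a view, so no data is copied before the multiplication.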