I am looking to batch multiply two matrices, both of dimensions torch.Size([7, 265, 768]).
When I tried torch.bmm(x, y), I got this error:
RuntimeError: invalid argument 2: wrong matrix size, batch1: 265x768, batch2: 265x768 at /Users/soumith/code/builder/wheel/pytorch-src/aten/src/TH/generic/THTensorMath.cpp:2312
Any idea what might be causing this?
It might be because you have to multiply a 265x768 matrix by another matrix of size 768xN to get a 265xN matrix.
Actually, both matrices have the same dimensions. I checked their sizes by printing them:
torch.Size([7, 265, 768])
torch.Size([7, 265, 768])
That’s the problem… you cannot multiply those matrices.
To multiply two matrices, the number of columns of the first must equal the number of rows of the second:
A: NxM
B: MxS
then A*B --> NxS
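As a rough sketch of the shape rule above, extended with a leading batch dimension: the helper name `bmm_output_shape` is made up for illustration and is not part of PyTorch. It only checks shapes and does no arithmetic.

```python
def bmm_output_shape(shape_a, shape_b):
    """Return the result shape of a batched matrix product, or raise ValueError.

    Hypothetical helper for illustration only; it mirrors the rule
    (B, N, M) x (B, M, S) -> (B, N, S) and is not a PyTorch function.
    """
    if len(shape_a) != 3 or len(shape_b) != 3:
        raise ValueError("both operands must be 3-D: (batch, rows, cols)")
    batch_a, n, m_a = shape_a
    batch_b, m_b, s = shape_b
    if batch_a != batch_b:
        raise ValueError(f"batch sizes differ: {batch_a} vs {batch_b}")
    if m_a != m_b:
        # Columns of the first operand must equal rows of the second
        raise ValueError(f"inner dimensions differ: {m_a} vs {m_b}")
    return (batch_a, n, s)
```

With the sizes from this thread, bmm_output_shape((7, 265, 768), (7, 768, 500)) gives (7, 265, 500), while bmm_output_shape((7, 265, 768), (7, 265, 768)) raises ValueError because 768 does not equal 265.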
Thanks @JuanFMontesinos. I thought it would work the same way as in maths. So even for two 2D matrices of the same dimensions, does one of them have to be squeezed or otherwise changed in dimension so that they fit the structure
A: NxM
B: MxS
then A*B --> NxS
or am I understanding it wrong?
I’m sorry, but the way I described is the way it works in math. The only change is that you are adding a third dimension corresponding to the batch.
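import torch
a = torch.rand(7, 265, 768)  # batch of 7 matrices, each 265x768
b = torch.rand(7, 768, 500)  # batch of 7 matrices, each 768x500
c = torch.bmm(a, b)          # inner dimensions (768) match
c.size()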
Out[5]: torch.Size([7, 265, 500])
Alan_Yu (Alan Yu), August 25, 2022, 8:24pm
You need to take the transpose of the second matrix to make the dimensions match.
import torch
a = torch.rand(7, 265, 768)
b = torch.rand(7, 265, 768)
# Swap the last two dimensions of b: (7, 265, 768) -> (7, 768, 265)
c = torch.matmul(a, b.transpose(-2, -1))
c.size()
torch.Size([7, 265, 265])
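A small sketch to check what the transposed product computes, assuming the goal is a times b-transposed for each batch entry. The tensor sizes are shrunk stand-ins for the 7x265x768 tensors, chosen only to keep the run fast. It compares torch.bmm and torch.matmul against an explicit loop of 2D products.

```python
import torch

torch.manual_seed(0)  # fixed seed so the run is repeatable

# Small stand-ins for the 7x265x768 tensors in the thread (sizes are assumptions)
a = torch.rand(7, 5, 8)
b = torch.rand(7, 5, 8)

# Swap the last two dimensions of b: (7, 5, 8) -> (7, 8, 5)
c_bmm = torch.bmm(a, b.transpose(1, 2))
c_matmul = torch.matmul(a, b.transpose(-2, -1))

# Reference: one ordinary 2-D product per batch entry, stacked back together
c_loop = torch.stack([a[i] @ b[i].T for i in range(a.size(0))])

print(c_bmm.shape)  # torch.Size([7, 5, 5])
print(torch.allclose(c_bmm, c_matmul), torch.allclose(c_bmm, c_loop))
```

Entry [i, j, k] of the result is the dot product of row j of a[i] with row k of b[i], so this only fits if that pairwise row product is what you are after. If you need a different contraction, the shapes have to be arranged differently.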