# Matrix-vector multiply (handling batched data)

I am trying to compute a matrix-vector multiply over a batch of vector inputs.
Given:

```
# (batch x inp)
v = torch.randn(5, 15)
# (inp x output)
M = torch.randn(15, 20)
```

Compute:

```
# (batch x output)
out = torch.Tensor(5, 20)
for i, batch_v in enumerate(v):
    out[i] = (batch_v * M).t()
```

But (1) the multiplication seems to expect both inputs to have equal dimensions, resulting in a `RuntimeError: inconsistent tensor size at /home/enrique/code/vendor/pytorch/torch/lib/TH/generic/THTensorMath.c:623`, and (2) it'd be cool if I didn't have to loop over the row vectors.

`*` is elementwise multiplication. If you're using Python 3 you can use the `@` operator for matrix-vector and matrix-matrix multiplication. Otherwise you can always resort to `batch_v.mm(M)` (since batched vector-matrix multiplication is just matrix-matrix multiplication).
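For instance, with the exact shapes from the question, a quick sketch (both forms should give the same result):

```python
import torch

v = torch.randn(5, 15)   # batch x inp
M = torch.randn(15, 20)  # inp x output

out = v.mm(M)            # batch x output, no Python loop needed
same = v @ M             # equivalent on Python >= 3.5
```

Here `v.mm(M)` treats the batch of vectors as a 5x15 matrix, so the whole batch is handled in a single matrix-matrix multiply.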


We also have batched operations like `bmm`, `baddbmm` and some others that might allow you to eliminate the loop in Python.
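A minimal `bmm` sketch, with arbitrary example shapes (the names and sizes here are just for illustration):

```python
import torch

A = torch.randn(4, 3, 5)  # batch x n x m
B = torch.randn(4, 5, 2)  # batch x m x p

# bmm multiplies the i-th matrix of A with the i-th matrix of B,
# for every i in the batch, without a Python loop
C = torch.bmm(A, B)       # batch x n x p
```

Each `C[i]` equals `A[i].mm(B[i])`, which is exactly what the loop in the question was computing by hand.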

Thanks for the hints.
Indeed, I can solve this by just using matrix-matrix multiplies (I got confused with the batch dim), which is what my code was unknowingly trying to accomplish.
Just for the record, `@` is actually only for versions > 3.4.

I think these are the only Py3 versions that we support.

I was wondering… Matrix batched-vector multiply is just matrix-matrix multiply, but what about the inverse situation: Batched-matrix vector multiply?

Given a batched matrix `M` (batch x dim1 x dim2) and a vector `v` (dim2), is there something like a `bmv` method, similar to `bmm`, that would give you (batch x dim1)?

No, there’s not. But you could do it by reshaping your vector to look like a matrix (no memory copy there, just stride tricks):

```
M_batch = ... # batch x dim1 x dim2
v = ... # dim2
v_view = v.unsqueeze(0).expand(-1, len(v)).unsqueeze(2) # batch x dim2 x 1
output = M_batch.bmm(v_view) # batch x dim1 x 1
# optionally output.squeeze(2) to get batch x dim1 output
```

Alright, thanks for the info. Given efficient reshaping capabilities, all these operations can easily be implemented by the user.

I got it to work; however, `v.unsqueeze(0).expand(-1, len(v))` seems to give an empty tensor, on which `unsqueeze(2)` obviously fails.

I got it to work passing the batch size instead of -1.

```
v.unsqueeze(0).expand(M_batch.size(0), len(v)).unsqueeze(2)
```

Ah, my bad! You have to specify all sizes for `expand`.
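For the record, a complete runnable version of the batched-matrix x vector trick, with the batch size passed explicitly to `expand` (shapes here are arbitrary examples):

```python
import torch

M_batch = torch.randn(4, 3, 5)  # batch x dim1 x dim2
v = torch.randn(5)              # dim2

# expand needs every target size spelled out; it only adjusts
# strides, so no memory is copied
v_view = v.unsqueeze(0).expand(M_batch.size(0), len(v)).unsqueeze(2)  # batch x dim2 x 1

out = M_batch.bmm(v_view).squeeze(2)  # batch x dim1
```

Each `out[i]` should match `M_batch[i] @ v`, i.e. the per-batch matrix-vector product the thread was after.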