Matrix-vector multiply (handling batched data)

I am trying to do a matrix-vector multiply over a batch of vector inputs.
Given:

# (batch x inp)
v = torch.randn(5, 15)
# (inp x output)
M = torch.randn(15, 20)

Compute:

# (batch x output)
out = torch.Tensor(5, 20)
for i, batch_v in enumerate(v):
    out[i] = (batch_v * M).t()

But (1) the multiplication seems to expect both inputs to have equal dimensions, resulting in a RuntimeError: inconsistent tensor size at /home/enrique/code/vendor/pytorch/torch/lib/TH/generic/THTensorMath.c:623, and (2) it'd be cool if I didn't have to loop over the row vectors.

* is elementwise multiplication. If you're using Python 3 you can use the @ operator for matrix-vector and matrix-matrix multiplication. Otherwise you can always resort to v.mm(M) (since a batched vector-matrix multiply is just a matrix-matrix multiply).
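For example, a minimal loop-free sketch with the shapes from the question:

import torch

v = torch.randn(5, 15)   # batch x inp
M = torch.randn(15, 20)  # inp x output

out = v.mm(M)            # batch x output, no Python loop
# or, on Python >= 3.5:
out = v @ M
print(out.shape)         # torch.Size([5, 20])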


We also have batched operations like bmm, baddbmm and some others that might allow you to eliminate the loop in Python.
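For instance, a quick sketch of bmm (the shapes here are just for illustration):

import torch

A = torch.randn(5, 15, 20)  # batch x n x m
B = torch.randn(5, 20, 7)   # batch x m x p
C = torch.bmm(A, B)         # batch x n x p, one matmul per batch element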

Thanks for the hints.
Indeed, I can solve this by just using matrix-matrix multiplies (I got confused with the batch dim), which is what my code was unknowingly trying to accomplish.
Just for the record, @ is actually only available in Python 3.5 and up.

I think these are the only Py3 versions that we support.

I was wondering… Matrix batched-vector multiply is just matrix-matrix multiply, but what about the inverse situation: Batched-matrix vector multiply?

Given a batched matrix M (batch x dim1 x dim2) and a vector v (dim2), is there something like a bmv method, similar to bmm, that would give you (batch x dim1)?

No, there’s not. But you could do it by reshaping your vector to look like a matrix (no memory copy there, just stride tricks):

M_batch = ... # batch x dim1 x dim2
v = ... # dim2
v_view = v.unsqueeze(0).expand(-1, len(v)).unsqueeze(2) # batch x dim2 x 1
output = M_batch.bmm(v_view) # batch x dim1 x 1
# optionally output.squeeze(2) to get batch x dim1 output

Alright, thanks for the info. Given efficient reshaping capabilities, all these operations can easily be implemented by the user.

I tried this, however v.unsqueeze(0).expand(-1, len(v)) seems to give an empty tensor – on which unsqueeze(2) obviously fails.

I got it to work by passing the batch size instead of -1:

v.unsqueeze(0).expand(M_batch.size(0), len(v)).unsqueeze(2)
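For completeness, the full version that works for me (batch 5, dim1 3, dim2 4 chosen just for illustration):

import torch

M_batch = torch.randn(5, 3, 4)  # batch x dim1 x dim2
v = torch.randn(4)              # dim2

v_view = v.unsqueeze(0).expand(M_batch.size(0), len(v)).unsqueeze(2)  # batch x dim2 x 1
out = M_batch.bmm(v_view).squeeze(2)  # batch x dim1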

Ah, my bad! You have to specify all sizes for expand.
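As an aside, on newer PyTorch versions torch.matmul broadcasts this case directly, so the reshaping shouldn't even be necessary there; a one-line sketch:

out = torch.matmul(M_batch, v)  # batch x dim1; v is treated as batch x dim2 x 1 and the trailing 1 is dropped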