How can I do a batch-wise, row-wise matrix-vector multiplication?

If I have a matrix M of size (d1, d2) and a vector V of size d2, doing M*V gives me an output OUT of size (d1, d2), where each row of M has been multiplied by V.

I need to do the same thing batch-wise, where the matrix M is fixed and I have a batch of dB vectors.

In practice, given a tensor M of size (d1, d2) and a tensor V of size (dB, d2), I need to get as output OUT a tensor of size (dB, d1, d2), so that OUT[i] = M*V[i].

How can I efficiently get it?

I think you can get the desired behavior by explicitly inserting, via unsqueeze(1) on your “batched” vector, the dimension that broadcasting inserts implicitly in the single-vector case.
Example:

import torch

d1 = 16
d2 = 32
dB = 64

m = torch.randn(d1, d2)
v = torch.randn(dB, d2)

# reference: multiply m by each vector in the batch with an explicit loop
out = torch.empty(dB, d1, d2)
for i in range(dB):
    out[i, :, :] = m * v[i]
print(out.shape)  # torch.Size([64, 16, 32])

# broadcasting: v.unsqueeze(1) has shape (dB, 1, d2), which broadcasts
# against m's shape (d1, d2) to give (dB, d1, d2)
v2 = v.unsqueeze(1)
out2 = m * v2
print(out2.shape)  # torch.Size([64, 16, 32])
print(torch.allclose(out, out2))  # True

I think torch.einsum can do that directly, though I’m not sure how efficient it is compared to broadcasting.

>>> A = torch.randn((3,4))
>>> B = torch.randn((16,4))
>>> C = torch.einsum("ij,bj->bij", (A, B))
>>> C.size()
torch.Size([16, 3, 4])
>>> C[0]-A*B[0]  # for testing the correctness
tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])
>>> C[5]-A*B[5]
tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])