If I have a matrix M of size (d1, d2) and a vector V of size d2, M*V gives me an output OUT of size (d1, d2), where each row of M has been multiplied element-wise by V.

I need to do the same thing batch-wise, where the matrix M is fixed and I have a batch of dB vectors.

In practice, given a tensor M of size (d1, d2) and a tensor V of size (dB, d2), I need to get as output OUT a tensor of size (dB, d1, d2), so that OUT[i] = M*V[i].

I think you can get the desired behavior by explicitly inserting the dimension that was implicitly inserted in the single-vector case, via unsqueeze(1) on your “batched” vector.
Example:

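import torch

d1 = 16
d2 = 32
dB = 64
m = torch.randn(d1, d2)
v = torch.randn(dB, d2)

# Reference result: multiply m by each vector in the batch, one at a time
out = torch.empty(dB, d1, d2)
for i in range(dB):
    out[i, :, :] = m * v[i]
print(out.shape)

# Broadcast version: v2 has shape (dB, 1, d2), so m * v2 has shape (dB, d1, d2)
v2 = v.unsqueeze(1)
out2 = m * v2
print(out2.shape)

# Both approaches should give the same values
print(torch.allclose(out, out2))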
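To see why the unsqueeze(1) is needed, here is a small sketch of the broadcasting shape rule in plain Python, with no PyTorch required. The function name broadcast_shape is a hypothetical helper written for illustration, not a PyTorch API, and it only models the shape arithmetic: shapes are aligned from the last dimension, missing leading dimensions are treated as 1, and two sizes are compatible if they are equal or one of them is 1.

```python
from itertools import zip_longest

def broadcast_shape(a, b):
    """Hypothetical helper: shape produced by broadcasting shapes a and b."""
    result = []
    # Align from the trailing dimension; a missing leading dim counts as 1
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x == y or y == 1:
            result.append(x)
        elif x == 1:
            result.append(y)
        else:
            raise ValueError(f"cannot broadcast {a} with {b}")
    return tuple(reversed(result))

d1, d2, dB = 16, 32, 64

# Single vector: (d2,) is implicitly treated as (1, d2)
print(broadcast_shape((d1, d2), (d2,)))

# Batched vectors after unsqueeze(1): (dB, 1, d2)
print(broadcast_shape((d1, d2), (dB, 1, d2)))

# Without unsqueeze(1), d1 and dB would be aligned against each other
try:
    broadcast_shape((d1, d2), (dB, d2))
except ValueError as err:
    print(err)
```

Without the extra dimension, the dB axis of v lines up against the d1 axis of m, which only works when the two sizes happen to match (and then gives the wrong pairing of rows and vectors).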