Hi, I have a tensor x1 of shape 4x3x2x2 and a tensor x2 of shape 4x1. I would like to multiply x1 and x2 element-wise along axis 0 (which has a dimension of 4). Each such multiplication would be between a 3x2x2 tensor and a scalar, so the result would be a 4x3x2x2 tensor.

A for-loop implementation is below; is there a better (parallel) implementation, perhaps using one of PyTorch's multiply functions? Thanks a lot!

import torch

x1 = torch.rand(4, 3, 2, 2)
x2 = torch.rand(4, 1)
for i in range(4):
    print(x1[i] * x2[i])

I don’t know which approach would be more efficient. (I’ve never tested it.)

I could see torch.einsum() having a little extra overhead because it has
to parse the 'ijkl,im->ijkl' string. It could also be significantly less
efficient in certain situations if its generality prevents it from performing
the tensor multiplications in the most optimal way. But maybe it's smart
enough to figure out the optimal approach.
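For reference, here is a minimal sketch comparing the original loop, a broadcasting one-liner (reshaping x2 to (4, 1, 1, 1) so PyTorch broadcasts it over the trailing dims), and the einsum call discussed above. All three should produce the same 4x3x2x2 result:

```python
import torch

x1 = torch.rand(4, 3, 2, 2)
x2 = torch.rand(4, 1)

# Loop version (the original approach), stacked into one tensor
looped = torch.stack([x1[i] * x2[i] for i in range(4)])

# Broadcasting: view x2 as (4, 1, 1, 1) so it multiplies each 3x2x2 slice
broadcast = x1 * x2.view(4, 1, 1, 1)

# einsum version; the size-1 'm' axis appears only on the input,
# so it is summed away, which is a no-op here
es = torch.einsum('ijkl,im->ijkl', x1, x2)

print(torch.allclose(looped, broadcast))  # True
print(torch.allclose(looped, es))         # True
```

The broadcasting version avoids both the Python loop and einsum's string parsing, which is why it is often the first approach people try.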