Multiplication of 4D and 3D tensors creating a linear combination

Hi,
I have a 3D and a 4D tensor with the following shapes:
A = [16,16,13] and B = [16,16,13,49]

I would like to get a tensor of shape [16,16,49] without iterating in a loop.
Currently I’m doing:
for in_i in range(16):
    for in_j in range(16):
        results[in_i, in_j, :] = torch.matmul(A[in_i, in_j, :], B[in_i, in_j, :, :])

Any idea?
Thanks in advance!

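You could add a dummy dimension to A at dim2, so that matmul treats the first two dimensions as batch dimensions and multiplies a [1, 13] matrix by a [13, 49] matrix for each pair of indices: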

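import torch

# a is created with the dummy dimension already in place at dim2
a = torch.randn(16, 16, 1, 13)
b = torch.randn(16, 16, 13, 49)

res = torch.matmul(a, b).squeeze(2)  # [16, 16, 1, 49] -> [16, 16, 49]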

results = torch.zeros(16, 16, 49)
for in_i in range(16):
    for in_j in range(16):
        results[in_i,in_j,:] = torch.matmul(a[in_i, in_j , :], b[in_i, in_j, :, :])

print(torch.allclose(res, results))
> True
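
If you start from the 3D A in the question rather than creating it with the extra dimension, a minimal sketch could look like the following. It assumes random data with the shapes from the question; the einsum line is an alternative I'm adding for comparison, not something required by the approach above.

```python
import torch

A = torch.randn(16, 16, 13)
B = torch.randn(16, 16, 13, 49)

# Insert a singleton dimension at dim2: [16, 16, 13] -> [16, 16, 1, 13].
# matmul then broadcasts over the first two dims and computes
# [1, 13] @ [13, 49] for every (i, j) pair.
res = torch.matmul(A.unsqueeze(2), B).squeeze(2)  # shape [16, 16, 49]

# The same contraction written with einsum: sum over the shared index k.
res_einsum = torch.einsum('ijk,ijkl->ijl', A, B)

print(res.shape)
print(torch.allclose(res, res_einsum, atol=1e-5))
```

Passing the dimension explicitly to squeeze(2) only removes the dummy dimension, so the code keeps working if one of the other dimensions happens to have size 1.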

Thanks a lot!
Is matmul considered to be a heavy function?

Do you mean heavy in terms of computation?
It will be faster than the nested loop, but I’m not sure if it’ll use more memory.
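
If you want to check this on your own machine, a rough timing sketch could look like this. The helper names (loop_version, batched_version, time_it) are made up for illustration, the tensors are random with the shapes from the question, and the numbers will depend on your hardware, so treat it as a quick sanity check rather than a proper benchmark.

```python
import time
import torch

A = torch.randn(16, 16, 13)
B = torch.randn(16, 16, 13, 49)

def loop_version(A, B):
    # Nested-loop approach from the question.
    out = torch.zeros(A.shape[0], A.shape[1], B.shape[-1])
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            out[i, j, :] = torch.matmul(A[i, j, :], B[i, j, :, :])
    return out

def batched_version(A, B):
    # Single batched matmul with a dummy dimension at dim2.
    return torch.matmul(A.unsqueeze(2), B).squeeze(2)

def time_it(fn, repeats=20):
    # Average wall-clock time per call over a few repeats (CPU tensors).
    start = time.perf_counter()
    for _ in range(repeats):
        fn(A, B)
    return (time.perf_counter() - start) / repeats

loop_t = time_it(loop_version)
batched_t = time_it(batched_version)
print(f"loop: {loop_t * 1e3:.3f} ms, batched: {batched_t * 1e3:.3f} ms")

# unsqueeze returns a view, so adding the dummy dimension does not copy A.
shares_memory = A.unsqueeze(2).data_ptr() == A.data_ptr()
print(shares_memory)
```

On the memory side, the unsqueeze itself is a view, and the result tensor [16, 16, 49] has the same size in both versions; whether matmul allocates extra temporaries internally is something I haven't measured here.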