Dimension problem with torch.matmul

Hey guys,

I am currently working on converting a TensorFlow project to PyTorch. At the end of the prediction, I get a translation vector field and a rotation vector field of sizes [B, 3, h, w] and [B, 3, 3, h, w] respectively.
In the TensorFlow version, the outputs are [B, h, w, 3] for the translation field and [B, h, w, 3, 3] for the rotation field.

I now need to perform a matrix multiplication on two rotation fields. The TensorFlow code looks like this:

# with rot_mat1 and rot_mat2 being of size [B, h, w, 3, 3]
r2r1 = tf.matmul(rot_mat2, rot_mat1)

All working fine.

When I naively do the same in PyTorch:

# with rot_mat1 and rot_mat2 being of size [B, 3, 3, h, w]
r2r1 = torch.matmul(rot_mat2, rot_mat1)

I get the error:

File "/home/student/lukas/pycharm-sync/depth-from-video/utils.py", line 131, in combine
    r2r1 = torch.matmul(rot_mat2, rot_mat1)
RuntimeError: invalid argument 6: wrong matrix size at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:558

I know the problem is the different dimensions in PyTorch. But do you guys have any quick fix for how I get the desired result as in the Tensorflow version? I know the answer is most likely trivial, but I am somehow stuck at the moment.
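For reference, a minimal sketch of why the call fails: torch.matmul treats the last two dimensions as the matrices and batches over the leading ones, so with the [B, 3, 3, h, w] layout it tries to multiply (h x w) by (h x w) matrices. The sizes below are small illustrative values, not the ones from the project.

```python
import torch

B, H, W = 2, 4, 6  # illustrative sizes with H != W

rot_mat1 = torch.randn(B, 3, 3, H, W)
rot_mat2 = torch.randn(B, 3, 3, H, W)

# The last two dims (H, W) are used as the matrix dims, so this
# attempts (H x W) @ (H x W), which is invalid when H != W.
try:
    torch.matmul(rot_mat2, rot_mat1)
    failed = False
except RuntimeError as err:
    failed = True
    print("matmul failed:", err)

# With H == W the call runs, but it multiplies H x W matrices,
# not the 3 x 3 rotation matrices.
sq1 = torch.randn(B, 3, 3, 5, 5)
sq2 = torch.randn(B, 3, 3, 5, 5)
out = torch.matmul(sq2, sq1)
print(out.shape)
```

So a matching output shape alone does not show that the 3x3 matrices were the ones being multiplied.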

Thanks so much!

I’m not sure how TF broadcasts the matrices internally, but I guess it’s creating a new dim4 for the first tensor?
If that’s the case, the second approach should work:

import torch

B, H, W = 2, 5, 5

a = torch.randn(B, 3, H, W)
b = torch.randn(B, 3, 3, H, W)

out1 = torch.matmul(a.unsqueeze(1), b)  # a becomes [B, 1, 3, H, W]
out2 = torch.matmul(a.unsqueeze(2), b)  # a becomes [B, 3, 1, H, W]

Could you store the results from the TF operation and compare it to the PyTorch result?

Thanks for your quick reply!

Your solution works! The problem in my case is just that H and W are of different sizes. I think I did not explain my problem properly, sorry. My two tensors a and b are the same size:

B, H, W = 4, 128, 416

a = torch.randn(B, 3, 3, H, W)
b = torch.randn(B, 3, 3, H, W)

I have now tried a different approach: transposing the tensors before the multiplication to bring them to the size [B, H, W, 3, 3]:

b = torch.transpose(b, 1, 3)
b = torch.transpose(b, 2, 4)
>>> torch.Size([4, 128, 416, 3, 3])

a = torch.transpose(a, 1, 3)
a = torch.transpose(a, 2, 4)
>>> torch.Size([4, 128, 416, 3, 3])

c = torch.matmul(a, torch.transpose(b, 3, 4))
>>> torch.Size([4, 128, 416, 3, 3])

… and then transposing c back to [B, 3, 3, H, W]. But I am not sure if this somehow messes with the whole multiplication?
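One way to check this is to compare a single pixel against a plain 3x3 product. The sketch below uses small illustrative sizes and permute in place of the two transpose calls (they produce the same layout). It suggests that the round trip itself is fine, but that the extra torch.transpose(b, 3, 4) changes the result: it computes a @ b^T per pixel rather than a @ b.

```python
import torch

B, H, W = 2, 4, 6  # illustrative sizes, H != W on purpose

a = torch.randn(B, 3, 3, H, W)
b = torch.randn(B, 3, 3, H, W)

# Move the 3x3 matrix dims to the end: [B, 3, 3, H, W] -> [B, H, W, 3, 3]
a_last = a.permute(0, 3, 4, 1, 2)
b_last = b.permute(0, 3, 4, 1, 2)

# The two transpose calls from the post give the same layout
a_tr = torch.transpose(torch.transpose(a, 1, 3), 2, 4)
same_layout = torch.equal(a_last, a_tr)

# Batched 3x3 product over B, H, W, then back to [B, 3, 3, H, W]
c = torch.matmul(a_last, b_last).permute(0, 3, 4, 1, 2)

# Reference: plain 3x3 product at one pixel
ref = a[0, :, :, 1, 2] @ b[0, :, :, 1, 2]
matches_ref = torch.allclose(c[0, :, :, 1, 2], ref, atol=1e-5)

# With the extra transpose of b, the per-pixel result is a @ b^T instead
c_bt = torch.matmul(a_last, b_last.transpose(3, 4)).permute(0, 3, 4, 1, 2)
ref_bt = a[0, :, :, 1, 2] @ b[0, :, :, 1, 2].T
matches_bt = torch.allclose(c_bt[0, :, :, 1, 2], ref_bt, atol=1e-5)

print(same_layout, matches_ref, matches_bt)
```

If the goal is to reproduce tf.matmul(rot_mat2, rot_mat1), the version without the extra transpose is the one that matches a plain per-pixel product.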

Ah OK, in that case I misunderstood the use case.
Yes, you could permute the inputs and permute the result back.
However, from the TF docs it’s unclear to me how the 5 dimensions are handled, so could you explain the underlying operations or at least compare the outputs before continuing to port the code? :wink:
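For what it's worth, tf.matmul treats the last two dimensions as the matrices and batches over all leading dimensions, which is also how torch.matmul behaves. As a possible alternative to permuting, here is a sketch using torch.einsum directly on the [B, 3, 3, H, W] layout; the sizes are illustrative and the permute-based result is only used as a cross-check.

```python
import torch

B, H, W = 2, 4, 6  # illustrative sizes

rot_mat1 = torch.randn(B, 3, 3, H, W)
rot_mat2 = torch.randn(B, 3, 3, H, W)

# r2r1[b, i, k, h, w] = sum_j rot_mat2[b, i, j, h, w] * rot_mat1[b, j, k, h, w]
r2r1 = torch.einsum('bijhw,bjkhw->bikhw', rot_mat2, rot_mat1)

# Cross-check against the permute -> matmul -> permute route
ref = torch.matmul(
    rot_mat2.permute(0, 3, 4, 1, 2),
    rot_mat1.permute(0, 3, 4, 1, 2),
).permute(0, 3, 4, 1, 2)
agree = torch.allclose(r2r1, ref, atol=1e-5)

print(r2r1.shape, agree)
```

Comparing against stored TF outputs is still the safest confirmation, since this only checks the two PyTorch routes against each other.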