Dimension problem with torch.matmul

Hey guys,

I am currently converting a TensorFlow project to PyTorch. At the end of the prediction I get a translation vector field and a rotation field of sizes [B, 3, h, w] and [B, 3, 3, h, w], respectively.
In the TensorFlow version, the outputs are [B, h, w, 3] for the translation field and [B, h, w, 3, 3] for the rotation field.

I now need to perform a matrix multiplication on two rotation fields. The TensorFlow code looks like this:

# with rot_mat1 and rot_mat2 being of size [B, h, w, 3, 3]
r2r1 = tf.matmul(rot_mat2, rot_mat1)

All working fine.

When I naively do the same in PyTorch:

# with rot_mat1 and rot_mat2 being of size [B, 3, 3, h, w]
r2r1 = torch.matmul(rot_mat2, rot_mat1)

I get the error:

File "/home/student/lukas/pycharm-sync/depth-from-video/utils.py", line 131, in combine
    r2r1 = torch.matmul(rot_mat2, rot_mat1)
RuntimeError: invalid argument 6: wrong matrix size at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:558

I know the problem is the different dimension ordering in PyTorch. But do you have a quick fix for getting the same result as in the TensorFlow version? The answer is most likely trivial, but I am somehow stuck at the moment.

Thanks so much!

I’m not sure how TF broadcasts the matrices internally, but I guess it’s creating a new dim for the first tensor?
If that’s the case, one of these two approaches should work:

import torch

B, H, W = 2, 5, 5

a = torch.randn(B, 3, H, W)
b = torch.randn(B, 3, 3, H, W)

out1 = torch.matmul(a.unsqueeze(1), b)  # (B, 1, 3, H, W) @ (B, 3, 3, H, W)
out2 = torch.matmul(a.unsqueeze(2), b)  # (B, 3, 1, H, W) @ (B, 3, 3, H, W)

Could you store the results from the TF operation and compare them to the PyTorch results?
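For reference, the rule torch.matmul follows is: the last two dimensions are the matrix dimensions, and everything in front of them is broadcast as batch dimensions. A minimal sketch with made-up toy sizes showing why the layout matters:

```python
import torch

# torch.matmul multiplies over the *last two* dims and broadcasts the rest,
# so the layout decides which "matrices" actually get multiplied.
x = torch.randn(2, 5, 7, 3, 3)   # channels-last: per-pixel 3x3 matrices
y = torch.randn(2, 3, 3, 5, 7)   # channels-first: 5x7 "matrices" instead

print(torch.matmul(x, x).shape)  # torch.Size([2, 5, 7, 3, 3])

# With the channels-first layout the inner dims don't match (7 vs. 5),
# which is the kind of "wrong matrix size" error from the traceback above:
try:
    torch.matmul(y, y)
except RuntimeError as e:
    print("fails:", type(e).__name__)
```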

Thanks for your quick reply!

Your solution works! The problem in my case is just that H and W are different sizes. I don’t think I explained my problem properly, sorry. My two tensors a and b are the same size:

import torch

B, H, W = 4, 128, 416

a = torch.randn(B, 3, 3, H, W)
b = torch.randn(B, 3, 3, H, W)

I now tried a different approach: transposing the tensors before the multiplication to bring them to size [B, H, W, 3, 3]:

b = torch.transpose(b, 1, 3)
b = torch.transpose(b, 2, 4)
b.shape
>>> torch.Size([4, 128, 416, 3, 3])

a = torch.transpose(a, 1, 3)
a = torch.transpose(a, 2, 4)
a.shape
>>> torch.Size([4, 128, 416, 3, 3])

c = torch.matmul(a, torch.transpose(b, 3, 4))  # note: the extra transpose computes a @ bᵀ, not a @ b
c.shape
>>> torch.Size([4, 128, 416, 3, 3])

… and then transposing c back to [B, 3, 3, H, W]. But I am not sure whether this somehow messes up the multiplication?

Ah OK, in that case I misunderstood the use case.
Yes, you could permute the inputs and permute the result back.
However, from the TF docs it’s unclear to me how the 5 dimensions are handled, so could you explain the underlying operations, or at least compare the outputs, before continuing to port the code? :wink:
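One way to sanity-check the permute round trip is to compare it against an einsum that contracts the 3x3 dims directly in the channels-first layout. A sketch with smaller spatial sizes (note this reproduces a plain per-pixel matmul(a, b); if you keep the extra transpose(b, 3, 4), the reference would need b's matrix dims swapped):

```python
import torch

B, H, W = 4, 16, 32  # smaller spatial sizes for a quick check

a = torch.randn(B, 3, 3, H, W)
b = torch.randn(B, 3, 3, H, W)

# Move the 3x3 matrix dims to the end, matmul, and move them back.
a_last = a.permute(0, 3, 4, 1, 2)   # [B, H, W, 3, 3]
b_last = b.permute(0, 3, 4, 1, 2)   # [B, H, W, 3, 3]
c = torch.matmul(a_last, b_last)    # per-pixel 3x3 @ 3x3
c = c.permute(0, 3, 4, 1, 2)        # back to [B, 3, 3, H, W]

# Independent reference: contract the matrix dims in place with einsum.
ref = torch.einsum('bikhw,bkjhw->bijhw', a, b)

print(torch.allclose(c, ref, atol=1e-5))  # True
```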