Is torch.matmul with Mx0 and 0xN tensors a zero filled MxN tensor?

Sujoy_Saraswati · August 10, 2021, 8:00am

Hi,
When torch.matmul is applied to two tensors of shape Mx0 and 0xN, the result (at least on CPU) is a zero-filled MxN tensor. I see the same behavior in NumPy as well. My question: is the result guaranteed to be zero-filled, or are the values undefined and random? It would be great if there is any documentation on this.
Regards,
Sujoy

AlphaBetaGamma96 · August 10, 2021, 9:35am

Hi

It seems that regardless of how the tensors are initialized, the product is a zero-filled tensor. For example,

>>> import torch
>>> M=10
>>> N=11
>>> 
>>> mat1 = torch.Tensor(M,0) # uninitialized (legacy constructor, like torch.empty(M,0))
>>> mat2 = torch.Tensor(0,N)
>>> mat1@mat2
tensor([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
>>> 
>>> mat1 = torch.randn(M,0) # normally distributed
>>> mat2 = torch.randn(0,N)
>>> mat1@mat2
tensor([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
>>> torch.randn(M,0)
tensor([], size=(10, 0))

I’m not 100% sure why, and it would be best to get a dev’s opinion on it, but I’d guess it’s linked to the fact that mat1 and mat2 contain no elements at all and are actually initialized as empty tensors, like

>>> mat1 = torch.randn(M,0)
>>> mat1
tensor([], size=(10, 0))

I’d assume that because the inputs are empty, the result defaults to 0 when you perform the matmul operation!

Sujoy_Saraswati · August 10, 2021, 11:30am

Thank you @AlphaBetaGamma96. I see the same behavior too, but I have not been able to find any documentation or reference confirming that this is the expected result.

gphilip · August 10, 2021, 11:40am

Here is what I suspect happens, though I couldn’t find the code to match it: torch.matmul() starts off by initializing a matrix of the correct output size filled with zeroes. It then iterates over the relevant dimensions of the two input matrices, computes the various dot products, and replaces each zero in the output matrix with the correct dot product. Once this iteration is done, it returns the output matrix which is now the product. This is how we would write simple code for matrix multiplication.

In this case, since there is nothing to iterate over, the zeroes remain as they are and we get a matrix of the correct size filled with zeroes.
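The loop described above can be sketched in plain Python. This only illustrates the textbook algorithm and is not PyTorch's actual implementation, which was not located in this thread. The function name naive_matmul and the list-of-rows representation are assumptions made for the example. Since an empty list of rows cannot record the column count of a 0xN matrix, the output width n is passed in explicitly.

```python
def naive_matmul(a, b, n):
    """Multiply an m x k matrix `a` by a k x n matrix `b` (lists of rows).

    `n` is passed explicitly because a 0 x n matrix has no rows from
    which the column count could be read.
    """
    m = len(a)
    k = len(b)  # shared inner dimension
    # Start from an m x n output filled with zeros.
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            # When k == 0 this loop body never runs, so out[i][j] stays 0.0.
            for p in range(k):
                out[i][j] += a[i][p] * b[p][j]
    return out


# An M x 0 matrix is M empty rows; a 0 x N matrix is an empty list of rows.
M, N = 3, 4
result = naive_matmul([[] for _ in range(M)], [], N)
print(result)
```

With an inner dimension of 0 the innermost loop has nothing to iterate over, so every entry keeps its initial value of 0.0.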

tom · August 10, 2021, 11:52am

If you spell out the formula for matrix multiplication, you see that it has a sum which in your case has no summands. The convention / consistent definition is that empty sums are 0.

Best regards

Thomas
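The empty-sum convention can be seen in Python's own built-ins. Entry (i, j) of the product is the sum over k of A[i][k] * B[k][j], and with an inner dimension of 0 that sum has no terms. A small sketch, where the helper name dot is made up for illustration:

```python
import math


def dot(row, col):
    """Dot product written as the sum from the matrix-product formula."""
    return sum(x * y for x, y in zip(row, col))


print(dot([], []))    # a sum with no summands -> 0
print(sum([]))        # the empty sum is the additive identity -> 0
print(math.prod([]))  # likewise, the empty product is the multiplicative identity -> 1
```

Defining the empty sum as 0 is what keeps identities such as sum(xs + ys) == sum(xs) + sum(ys) valid when one of the lists is empty.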

Sujoy_Saraswati · August 10, 2021, 3:31pm

Thanks. So is it correct to conclude that if a matmul of (Mx0, 0xN) comes up during the execution of a model, subsequent ops can rely on the result being a zero-filled MxN tensor?

tom · August 10, 2021, 3:50pm

The result is all zeros and you can rely on that. There will not be any automatic short-circuiting for “known zeros”. I am not sure if you meant the former or the latter.

Best regards

Thomas

Sujoy_Saraswati · August 11, 2021, 3:09am

Thanks, I had meant the former.