Why does NumPy (and PyTorch) allow direct multiplication of two row vectors?

The following code works for direct multiplication between two row vectors of the same shape, but it seems conceptually wrong to me.

A dot product can only be taken between a row vector and a column vector of the same length, as implemented in the np.dot(), torch.mm(), and torch.matmul() functions.
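
For concreteness, a minimal sketch (variable names are mine) of the shapes these functions do accept:

import numpy as np
import torch

a = torch.randn(5)        # 1-D tensors: torch.dot is defined here
b = torch.randn(5)
torch.dot(a, b)           # scalar (0-D) result

row = torch.randn(1, 5)   # 2-D row vector
col = torch.randn(5, 1)   # 2-D column vector
torch.mm(row, col)        # (1, 5) @ (5, 1) -> (1, 1) matrix

x = np.random.randn(5)
y = np.random.randn(5)
np.dot(x, y)              # scalar result for 1-D arrays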

So I wonder why NumPy and PyTorch allow this in their design in the first place, since it could confuse beginners.

import torch

def activation(x):
    # Assumed definition: the original snippet calls an undefined
    # `activation`; a sigmoid is a common choice here.
    return 1 / (1 + torch.exp(-x))

features = torch.randn((1, 5))
weights = torch.randn_like(features)
bias = torch.randn((1, 1))

# Elementwise multiply, sum the products, add the bias, then squash
output_sum = activation(torch.sum(features * weights) + bias)
# output_sum = activation((features * weights).sum() + bias)

Output:
tensor([[0.1595]])
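
For reference, a small sketch (same shapes as above, with a hypothetical seed for reproducibility) confirming that this sum of elementwise products matches the matrix product against the transposed weights:

import torch

torch.manual_seed(0)  # hypothetical seed, just for reproducibility
features = torch.randn(1, 5)
weights = torch.randn(1, 5)

s1 = torch.sum(features * weights)    # elementwise multiply, then reduce to a scalar
s2 = torch.mm(features, weights.t())  # (1, 5) @ (5, 1) -> (1, 1) matrix product

print(torch.allclose(s1, s2))         # True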

I think it’s impossible to fully distill the intent of a given numerical library’s interface, since it’s debatable whether NumPy or PyTorch established these conventions or whether they were borrowed from predecessors like MATLAB, BLAS, etc.

However, what is being done in this case is elementwise multiplication (as specified by the * operator or the mul() method). That operation is usually considered distinct from a dot product or a matrix multiplication.

>>> import torch
>>> a = torch.randn(1, 5)
>>> b = torch.randn(1, 5)
>>> a * b
tensor([[-0.0809,  0.8302, -0.3525, -0.1582, -0.6963]])
>>> a.mul(b)
tensor([[-0.0809,  0.8302, -0.3525, -0.1582, -0.6963]])
>>> a.dot(b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: 1D tensors expected, but got 2D and 2D tensors
>>>
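
As the traceback says, torch.dot is only defined for 1-D tensors, so a minimal fix (a sketch reusing a and b from above) is to flatten the (1, 5) tensors first:

# Flatten (1, 5) -> (5,) so that torch.dot applies
d = torch.dot(a.flatten(), b.flatten())
# Equivalently: a.squeeze().dot(b.squeeze())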

Thanks for the reply.

Yes, it’s doing elementwise multiplication, and that’s why a sum is needed afterwards to obtain the same result as a dot product.
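
To round this out with the NumPy half of the question, here is a parallel sketch (array names are mine) showing the same pattern: elementwise * followed by a sum reproduces the matrix-product result:

import numpy as np

x = np.random.randn(1, 5)
y = np.random.randn(1, 5)

s = (x * y).sum()           # elementwise multiply, then sum
d = np.dot(x, y.T).item()   # (1, 5) @ (5, 1) -> (1, 1), extract the scalar

print(np.isclose(s, d))     # True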