Sum numpy array with pytorch tensor

I was doing some tests with pytorch tensors and something caught my attention. Here are the steps to reproduce:

  1. Create a pytorch tensor (either on cpu or gpu);
  2. Create a numpy array;
  3. Sum the pytorch tensor with the numpy array (this fails);
  4. Sum the numpy array with the pytorch tensor (this works and returns a cpu tensor).

So, is this behavior expected? I could not find any documentation or forum topic about it.

Example code follows:

>>> a = torch.tensor((10,10), device='cuda')
>>> a
tensor([10, 10], device='cuda:0')
>>> c = np.array((10,10))
>>> c
array([10, 10])
>>> a + c
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: add() received an invalid combination of arguments - got (numpy.ndarray), but expected one of:
 * (Tensor other, Number alpha)
 * (Number other, Number alpha)

>>> c + a
tensor([20, 20])
>>> b = torch.tensor((10,10))
>>> b
tensor([10, 10])
>>> b + c
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: add() received an invalid combination of arguments - got (numpy.ndarray), but expected one of:
 * (Tensor other, Number alpha)
 * (Number other, Number alpha)

>>> c + b
tensor([20, 20])
>>>

Hi,

I think this is mostly expected; my understanding is that:

  • When you do tensor + array, the sum op from pytorch is used, and we do not support adding a numpy array to a Tensor; you should use torch.from_numpy() to get a Tensor first (see the sketch after this list).
  • When you do array + tensor, numpy’s sum op is used, and it seems to do weird things when given a tensor, like moving it to the cpu and then returning another tensor. I’m not sure why though; you would need to check numpy’s code.
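
For example, something like this should work (just a minimal sketch; I’m picking the device with torch.cuda.is_available(), falling back to cpu if no gpu is present):

import numpy as np
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'
a = torch.tensor((10, 10), device=device)
c = np.array((10, 10))

# Convert the numpy array to a Tensor first, then move it to the same
# device as `a`, so the whole addition stays inside pytorch.
result = a + torch.from_numpy(c).to(a.device)
print(result)  # tensor([20, 20], device='cuda:0') when running on a gpu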

I agree that tensor + array returning an error is expected. I just thought array + tensor would be handled the same way, but, as you mentioned, if it is handled on numpy’s side, then that is up to them. I tried to find the source code for the ‘+’ operation, which is different from numpy.sum (numpy.sum returns an error if I try to sum a tensor with it), but I could not find it.

As it was just something curious, I’ll leave it at that for now, until I have some time to look carefully at numpy’s source code.

Thanks for your reply!

I think what happens with numpy is that it finds that tensors can be handled as sequences and extracts the elements from them one by one, hence converting the gpu tensor to a cpu one. I’m not sure why it returns a tensor and not a numpy array though…
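
Just to illustrate what I mean by “handled as sequences” (this is only a guess at the mechanism, not a claim about what numpy actually does internally):

import torch

t = torch.tensor([10, 10])

# A tensor can be iterated like a Python sequence: each element comes out
# as a 0-dim tensor, so the values can be pulled out one by one.
print(list(t))               # [tensor(10), tensor(10)]
print([int(x) for x in t])   # [10, 10]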