Can tensors on different devices be added together?

Hi, everyone!

I found something curious recently. As far as I know, when you want to perform an operation on two tensors, you should make sure they are on the same device. But when I write my code like this, it runs unexpectedly well:

import torch
a = torch.tensor(1, device='cuda')
print(a.device)
b = torch.tensor(2, device='cpu')
print(b.device)
print(a+b)


cuda:0
cpu
tensor(3, device='cuda:0')

But it doesn’t work in my code like this:

pts_1_tile = torch.tensor([[0], [0]], dtype=torch.float32)
torch.add(pred_4pt_shift, pts_1_tile)

[screenshot of the device mismatch error]

Here pred_4pt_shift is an intermediate result from a sub-net, and it is a tensor on the GPU. My question is: why does the first snippet work, while the second one raises this device mismatch error?

In your use case, b would be interpreted as a scalar value and, if I’m not mistaken, would be forwarded directly as an argument to the CUDA kernel.
E.g., adding a dimension to b will break the code, as seen here:

# works
a = torch.tensor(1, device='cuda')
print(a.device)
b = torch.tensor(2, device='cpu')
print(b.device)
print(a+b)
print(torch.add(a, b))

# works
a = torch.tensor([1], device='cuda')
print(a.device)
b = torch.tensor(2, device='cpu')
print(b.device)
print(a+b)
print(torch.add(a, b))

# fails
a = torch.tensor(1, device='cuda')
print(a.device)
b = torch.tensor([2], device='cpu')
print(b.device)
print(a+b)
print(torch.add(a, b))

Thank you!
So the right thing to do is that every time I define a tensor, I need to make sure it is on the same device as my model, right?

Yes, I would recommend explicitly moving the data to the desired device in order to avoid any surprising issues.
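For example, a minimal sketch of how you could fix the snippet above (assuming pred_4pt_shift is the CUDA float tensor produced by your sub-net; the shape used below is just a stand-in for illustration):

import torch

# stand-in for the sub-net output from the question (assumed shape and dtype)
pred_4pt_shift = torch.randn(2, 1, device='cuda')

# create the constant tensor directly on the same device as pred_4pt_shift
pts_1_tile = torch.tensor([[0], [0]], dtype=torch.float32, device=pred_4pt_shift.device)
print(torch.add(pred_4pt_shift, pts_1_tile))

# alternatively, move an existing CPU tensor over explicitly
pts_1_tile_cpu = torch.tensor([[0], [0]], dtype=torch.float32)
print(torch.add(pred_4pt_shift, pts_1_tile_cpu.to(pred_4pt_shift.device)))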
However, as already described, scalar values are fine to use with tensors on e.g. a GPU:

a = torch.randn(10, device='cuda')
print(a+2)

Thanks a lot! Have a nice day :wink: