I am using PyTorch 1.1 and I found the following problem:

>>> import torch
>>> torch.cuda.is_available()
True
>>> x = torch.tensor(1).cuda()
>>> x
tensor(1, device='cuda:0')
>>> y = torch.tensor(2)
>>> x+y
tensor(3, device='cuda:0')
>>> y+x
tensor(3)
>>> (y+x).device
device(type='cpu')
>>> (x+y).device
device(type='cuda', index=0)

Here `x` is a CUDA tensor while `y` is a CPU tensor. The device of the resulting tensor changes depending on the order of the operands. Is this expected behavior? And when should the result end up on the GPU versus the CPU?
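In the meantime, a workaround I can think of is to move both operands to one explicit device before the addition, so the result's device no longer depends on operand order. A minimal sketch (the fallback to CPU when CUDA is unavailable is my own addition, not part of the original snippet):

```python
import torch

# Choose one device explicitly; fall back to CPU when CUDA is unavailable.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.tensor(1, device=device)
y = torch.tensor(2).to(device)  # move the second tensor explicitly

# With both operands on the same device, the sum's device is
# the same regardless of operand order.
print((x + y).device == (y + x).device)
```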