What is the difference between using tensor.cuda() and tensor.to(torch.device("cuda:0"))?

Leockl · July 15, 2020, 5:32am

Using PyTorch, what is the difference between the following two methods of sending a tensor to the GPU?

Method 1:

X = np.array([[1, 3, 2, 3], [2, 3, 5, 6], [1, 2, 3, 4]])
X = torch.DoubleTensor(X).cuda()

Method 2:

X = np.array([[1, 3, 2, 3], [2, 3, 5, 6], [1, 2, 3, 4]])
X = torch.DoubleTensor(X)

device = torch.device("cuda:0")
X = X.to(device)

Similarly, is there any difference between the same two methods when they are used to send a model to the GPU?

Method A:

gpumodel = model.cuda()

Method B:

device = torch.device("cuda:0")
gpumodel = model.to(device)

Many thanks in advance!

iffiX · July 15, 2020, 5:47am

There is no difference.

ptrblck · July 15, 2020, 5:53am

There might be a difference, if you were resetting the default CUDA device via torch.cuda.set_device() as seen in this code snippet:

import torch

torch.cuda.set_device('cuda:1')
x = torch.randn(1).cuda()
print(x)
# > tensor([0.9038], device='cuda:1')  (uses the default device now)

y = torch.randn(1).to('cuda:0')
print(y)
# > tensor([-0.7296], device='cuda:0')  (explicitly specifies cuda:0)
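For anyone who wants to check this on their own machine, here is a rough sketch that records where a tensor lands for each call after the current device is changed. The helper name `compare_placements` is made up for illustration, and the comparison only runs when at least two GPUs are visible; otherwise it returns an empty dict. It also includes `.to('cuda')` without an index, which, like `.cuda()`, should follow the current device.

```python
import torch


def compare_placements() -> dict:
    """Hypothetical helper: report the device each call places a tensor on.

    Returns an empty dict when fewer than two CUDA devices are visible,
    because the comparison needs a non-zero current device.
    """
    if not torch.cuda.is_available() or torch.cuda.device_count() < 2:
        return {}

    previous = torch.cuda.current_device()
    try:
        # Make GPU 1 the current device for this process.
        torch.cuda.set_device(1)
        placements = {
            ".cuda()": str(torch.zeros(1).cuda().device),
            ".to('cuda')": str(torch.zeros(1).to("cuda").device),
            ".to('cuda:0')": str(torch.zeros(1).to("cuda:0").device),
        }
    finally:
        # Restore whatever device was current before the experiment.
        torch.cuda.set_device(previous)
    return placements


result = compare_placements()
for call, device in result.items():
    print(f"{call} -> {device}")
```

On a single-GPU or CPU-only machine the loop prints nothing, since there is no second device to switch to.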

Leockl · July 15, 2020, 5:55am

Ok thanks @iffiX for confirming they are both essentially doing the same thing.

iffiX · July 15, 2020, 5:56am

Ah yes, that's important. I forgot this.

Leockl · July 15, 2020, 5:57am

OK, many thanks @ptrblck for the more detailed answer: the 2nd method specifies which GPU device to use, while the 1st method just uses the default GPU device.

iacob · April 15, 2021, 7:04am

Their syntax varies slightly, but they are equivalent:

| Target | .to(name) | .to(device) | .cuda() |
|---|---|---|---|
| CPU | `to('cpu')` | `to(torch.device('cpu'))` | `cpu()` |
| Current GPU | `to('cuda')` | `to(torch.device('cuda'))` | `cuda()` |
| Specific GPU | `to('cuda:1')` | `to(torch.device('cuda:1'))` | `cuda(device=1)` |

Note: the current CUDA device is 0 by default, but this can be changed with torch.cuda.set_device().
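
One practical reason to prefer the `.to(device)` form is that the same code can run with or without a GPU. The following is only a minimal sketch: `pick_device` is a hypothetical helper, and the `Linear` layer is a stand-in for whatever model is actually being moved. It reuses the array from the original question.

```python
import torch


def pick_device(index: int = 0) -> torch.device:
    """Hypothetical helper: cuda:<index> if that GPU exists, otherwise CPU."""
    if torch.cuda.is_available() and index < torch.cuda.device_count():
        return torch.device(f"cuda:{index}")
    return torch.device("cpu")


device = pick_device(0)

# Tensor.to() returns a tensor on the target device; x itself is not
# modified, so the result has to be assigned.
x = torch.tensor([[1, 3, 2, 3], [2, 3, 5, 6], [1, 2, 3, 4]], dtype=torch.float64)
x_dev = x.to(device)

# Module.to() moves the parameters in place and returns the same module.
model = torch.nn.Linear(4, 2).double()  # stand-in model, float64 to match x
model_dev = model.to(device)

out = model_dev(x_dev)
print(out.shape, out.device)
```

Because `pick_device` falls back to the CPU, the script does not need a `.cuda()` call that would raise an error on a machine without CUDA.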