Expected device cuda:0 but got device cpu though the two tensors are on cuda

Hi, I thought I was doing things right, but I got this error:

features = torch.from_numpy(features).cuda()
print(features.device) #  shows cuda:0
dists = self.calculate_distance(features).cuda()
print(dists.device) # shows cuda:0
dists.add_(torch.tril(100 * torch.ones(len(features), len(features))))

But the last line fails with this error: RuntimeError: expected device cuda:0 and dtype Float but got device cpu and dtype Float

Anything I’m doing wrong?

Thank you

Hi @jpainam,

You have to move the torch.ones Tensor to the GPU:

dists.add_(torch.tril(100 * torch.ones(len(features), len(features)).cuda()))
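A variation on the fix above, as a minimal sketch: instead of building the mask on the CPU and copying it over with .cuda(), it can be allocated directly on whatever device `dists` already lives on. The matrix size `n` and the zero-filled `dists` are stand-ins for the values in the question, and the CPU fallback is only there so the snippet runs on machines without a GPU.

```python
import torch

# Fall back to the CPU so the sketch also runs without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for the distance matrix from the question (hypothetical values).
n = 4
dists = torch.zeros(n, n, device=device)

# Allocate the mask with the same device and dtype as `dists`,
# so the in-place add_ sees two matching tensors.
mask = torch.tril(torch.full((n, n), 100.0, device=dists.device, dtype=dists.dtype))
dists.add_(mask)

print(dists.device, mask.device)
```

Passing device=dists.device avoids hard-coding "cuda" and skips the extra host-to-device copy that .cuda() performs on a tensor created on the CPU.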

Thank you, it works.
But is there a way to move everything to CUDA at once if my class doesn’t extend nn.Module? Doing it for each tensor individually is a bit much.

You should be able to do that by setting the default tensor type to a CUDA type with torch.set_default_tensor_type.

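Another option for a class that does not extend nn.Module, sketched here under assumptions: let the class remember its device and give it a small `to` method that moves every tensor attribute, roughly mirroring what nn.Module.to does for parameters and buffers. The class name `DistanceHelper` and its `offset` attribute are hypothetical and only serve the illustration.

```python
import torch


class DistanceHelper:
    """Hypothetical plain class (not an nn.Module) that tracks its own device."""

    def __init__(self, device="cpu"):
        self.device = torch.device(device)
        # Tensors created inside the class are allocated on self.device.
        self.offset = torch.ones(3, device=self.device)

    def to(self, device):
        # Move every attribute that is a tensor; leave everything else alone.
        self.device = torch.device(device)
        for name, value in list(vars(self).items()):
            if torch.is_tensor(value):
                setattr(self, name, value.to(self.device))
        return self


# Fall back to the CPU so the sketch also runs without a GPU.
target = torch.device("cuda" if torch.cuda.is_available() else "cpu")
helper = DistanceHelper().to(target)
print(helper.offset.device)
```

With this pattern, a single helper.to(...) call replaces the per-tensor .cuda() calls, and new tensors inside the class can be created with device=self.device.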

I am doing:

  1. torch.set_default_tensor_type('torch.cuda.FloatTensor')
  2. source, target = source.cuda(), target.cuda()

But I am still getting the following error on Google Colab:

RuntimeError Traceback (most recent call last)
in ()
7 start_time = time.time()
8 # train
----> 9 train_loss = train(model, data_train, optimizer, criterion, CLIP, DEVICE)
10 # stop clock
11 end_time = time.time()

3 frames
/content/train.py in train(model, data, optimizer, criterion, clip, cohort_size)
47 # calculate loss
---> 48 loss = criterion(output, trg)
50 # backward propogation

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py in forward(self, input, target)
431 def forward(self, input, target):
--> 432 return F.mse_loss(input, target, reduction=self.reduction)

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in mse_loss(input, target, size_average, reduce, reduction)
2541 else:
2542 expanded_input, expanded_target = torch.broadcast_tensors(input, target)
-> 2543 ret = torch._C._nn.mse_loss(expanded_input, expanded_target, _Reduction.get_enum(reduction))
2544 return ret

RuntimeError: expected device cuda:0 but got device cpu
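The traceback ends in criterion(output, trg), which suggests that either `output` or `trg` is still on the CPU when the loss is computed. Changing the default tensor type only affects tensors created afterwards by factory functions; it does not move tensors that already exist or that come from NumPy. A minimal sketch of how to check and align the two devices follows. The tensor shapes and the helper name `loss_on_same_device` are assumptions, not taken from the original train.py.

```python
import torch
import torch.nn as nn

# Fall back to the CPU so the sketch also runs without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

criterion = nn.MSELoss()

# Hypothetical stand-ins for the model output and the target batch.
output = torch.randn(8, 1, device=device)
trg = torch.randn(8, 1)  # left on the CPU, as freshly loaded data often is


def loss_on_same_device(criterion, output, trg):
    # Print where each tensor lives, then move the target next to the output.
    print("output:", output.device, "trg:", trg.device)
    trg = trg.to(output.device)
    return criterion(output, trg)


loss = loss_on_same_device(criterion, output, trg)
print(loss.device)
```

It may also be worth checking the call itself: the traceback shows train being called with DEVICE as its last argument, while the signature names that parameter cohort_size, so the device may never reach the code that moves the batch.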