Allocation of Tensor on CUDA fails

skaldesh · February 15, 2022, 12:57pm

Hi, I want to create a tensor from an existing image buffer and have it copied directly to the GPU during creation.

The model was loaded like this:
mod_ = torch::jit::load("/data/deeplabv3-mobilenetv3-large.pt", torch::Device(torch::kCUDA, 0));

To create the tensor:
torch::from_blob((void*)buffer, {1, 3, 520, 520}, torch::TensorOptions().dtype(torch::kFloat32).device(torch::Device(torch::kCUDA, 0)));

But this fails with this error:
Specified device cuda:0 does not match device of data cuda:-2

I do not understand where this cuda:-2 comes from.
A workaround I found is to create the tensor on CPU and then use tensor.to(cudadevice). But I want to understand why the first one fails.

hyhuang00 · June 23, 2022, 5:35am

Did you manage to solve the problem? I ran into the same error!

skaldesh · June 27, 2022, 1:18pm

No, so far I am still using my workaround.

Amit_Gupta · October 19, 2022, 8:41pm

I am also running into the same issue. The problem with using to(device) is that it creates a non-leaf tensor, which causes further headaches. How did you get the gradients of tensors moved via .to(device)?

ptrblck · October 19, 2022, 10:01pm

You could create the tensor directly on the right device, if possible, or you could call .retain_grad() on the intermediate tensors.

Shobhit_Narayanan · November 10, 2022, 10:49am

How? This is still an issue.

Jack_Huang1 · September 15, 2023, 8:40am

I also ran into the same error. So the solution is just to use .to(device) instead?