Crash when using torch::from_blob with CUDA

I’m using libtorch 1.4.0 on my MSI P65 laptop (Win10, RTX 2060, NV driver 441.22, Cuda 26.21.14.4122).
When using the CPU, this code works as expected:

auto tensor1=torch::zeros({1, 1, 2, 2},torch::kCPU);
tensor1[0][0][0][0]=42;
cout << "tensor1:" << endl << tensor1 << endl << endl;

float tensor2Data[]{42,0,0,0};
auto tensor2=torch::from_blob(tensor2Data,{1, 1, 2, 2},torch::kCPU);
cout << "tensor2:" << endl << tensor2 << endl << endl;

And it shows:

tensor1:
(1,1,.,.) =
42 0
0 0
[ CPUFloatType{1,1,2,2} ]

tensor2:
(1,1,.,.) =
42 0
0 0
[ CPUFloatType{1,1,2,2} ]

Press to close this window…

However when switching to CUDA, with this code:

auto tensor1=torch::zeros({1, 1, 2, 2},torch::kCUDA);
tensor1[0][0][0][0]=42;
cout << "tensor1:" << endl << tensor1 << endl << endl;

float tensor2Data[]{42,0,0,0};
auto tensor2=torch::from_blob(tensor2Data,{1, 1, 2, 2},torch::kCUDA);
cout << "tensor2:" << endl << tensor2 << endl << endl;

It crashes at from_blob without any error message:

tensor1:
(1,1,.,.) =
42 0
0 0
[ CUDAFloatType{1,1,2,2} ]

Press to close this window…

Running a model on CUDA works - it's only when creating a CUDA tensor with from_blob that I get this crash, without any hint or explanation.
Any idea what's going wrong? Is there a workaround I could use to easily transfer blocks of data from RAM to a CUDA tensor?

Also, I tried creating the tensors CPU-side (first code) and then calling tensor.to(torch::kCUDA), but it didn't transfer the tensors to CUDA.
I could also transfer values one by one to a CUDA tensor using [][][][] like with tensor1, but it takes a really long time.

That's because the data is on the CPU and you're instructing PyTorch to treat the CPU pointer tensor2Data as a GPU pointer. You'd need to make it a CPU tensor first and then move it to CUDA.

As an aside, tensor indexing as in tensor1[0][0][0][0]=42; is not efficient if you do it at scale. Under the hood, each such statement slices the tensor and calls fill_ on the result; it is not simply writing to a memory location.

Best regards

Thomas

Are you suggesting I do this ?

float tensor2Data[]{42,0,0,0};
auto tensor2=torch::from_blob(tensor2Data,{1, 1, 2, 2},torch::kCPU);
tensor2.to(torch::kCUDA);

Because I tried, but then when I do

cout << "tensor2:" << endl << tensor2

All I get is:

tensor2:
(1,1,.,.) =
42 0
0 0
[ CPUFloatType{1,1,2,2} ]

Which indicates tensor2 is still on the CPU (CPUFloatType); it didn't move to the GPU.

Ohhh I just figured out I actually need to do:

tensor2=tensor2.to(torch::kCUDA);

It works now :smile:
