[CPP] C++ tensor cannot move from CPU to GPU

(Weihao Yuan) #1
module->to(at::kCUDA);

or

input_tensor.to(at::kCUDA);

When I try to move the model to the GPU and run the forward pass there, I get an error telling me to report a bug to PyTorch. The error message is as follows:

terminate called after throwing an instance of 'c10::Error'
  what():  p ASSERT FAILED at /pytorch/c10/core/impl/DeviceGuardImplInterface.h:130, please report a bug to PyTorch. DeviceGuardImpl for cuda is not available (getDeviceGuardImpl at /pytorch/c10/core/impl/DeviceGuardImplInterface.h:130)
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7f623ebd50f1 in /home/will/Softwares/libtorch/lib/libc10.so)
frame #1: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x2a (0x7f623ebd4a2a in /home/will/Softwares/libtorch/lib/libc10.so)
frame #2: at::native::to(at::Tensor const&, c10::Device, c10::ScalarType, bool, bool) + 0x18d9 (0x7f623f73fd99 in /home/will/Softwares/libtorch/lib/libcaffe2.so)
frame #3: at::TypeDefault::to(at::Tensor const&, c10::Device, c10::ScalarType, bool, bool) const + 0x1b (0x7f623f8ea7eb in /home/will/Softwares/libtorch/lib/libcaffe2.so)
frame #4: torch::jit::script::Module::to_impl(c10::optional<c10::Device> const&, c10::optional<c10::ScalarType> const&, bool) + 0x21b (0x7f624930affb in /home/will/Softwares/libtorch/lib/libtorch.so.1)
frame #5: torch::jit::script::Module::to_impl(c10::optional<c10::Device> const&, c10::optional<c10::ScalarType> const&, bool) + 0x69 (0x7f624930ae49 in /home/will/Softwares/libtorch/lib/libtorch.so.1)
frame #6: torch::jit::script::Module::to_impl(c10::optional<c10::Device> const&, c10::optional<c10::ScalarType> const&, bool) + 0x69 (0x7f624930ae49 in /home/will/Softwares/libtorch/lib/libtorch.so.1)
frame #7: torch::jit::script::Module::to(c10::Device, bool) + 0x26 (0x7f624930b356 in /home/will/Softwares/libtorch/lib/libtorch.so.1)
frame #8: main + 0x181 (0x42d277 in /home/will/Softwares/libtorch/example/project/cmake-build-debug/example-app)
frame #9: __libc_start_main + 0xf0 (0x7f623e273830 in /lib/x86_64-linux-gnu/libc.so.6)
frame #10: _start + 0x29 (0x42c859 in /home/will/Softwares/libtorch/example/project/cmake-build-debug/example-app)

torch::cuda::is_available() returns 0, but I do have 2 GPUs and can use them fine with PyTorch in Python.

(Weihao Yuan) #2

solved…

#3

Would you please tell me how you solved this problem?

(Weihao Yuan) #4

It was a problem with the CUDA and cuDNN versions. It was solved after I installed a new CUDA and cuDNN (CUDA 9.0 and cuDNN 7). I guess the PyTorch C++ library currently only works with certain versions; it is not very stable.

(Weihao Yuan) #5

By the way, how do I delete this question?

#6

Thank you very much. The same trouble occurred for me on Windows 10, CUDA 10, libtorch 1.0.1.

(Martin Huber) #7

Hm, might the problem actually be that libtorch does not know which GPU to move the tensor onto? If that is the case, you could move your module to a specific GPU like so:

module->to(torch::Device("cuda:0"));

where all further GPUs are indexed in increasing order: cuda:1, cuda:2, and so on.

#8

Tried that; it's not the cause.