C++ again: CUDA driver version is insufficient for CUDA runtime version

I am following the official tutorial and want to export a model from Python and call it from C++.

The model was successfully converted to .pt format in the Python environment, but on the C++ side, when I load the model and move it to the GPU, the following error occurs:

ai@ai:build$ ./resnet /home/ai/pan/brpc/code/torchcpp-tutorial/traced_resnet_model.pt
THCudaCheck FAIL file=../aten/src/THC/THCGeneral.cpp line=47 error=35 : CUDA driver version is insufficient for CUDA runtime version
terminate called after throwing an instance of 'std::runtime_error'
  what():  cuda runtime error (35) : CUDA driver version is insufficient for CUDA runtime version at ../aten/src/THC/THCGeneral.cpp:47
  • python
import torch
import torchvision

# An instance of your model.
model = torchvision.models.resnet18().cuda()
model = model.eval()

# An example input you would normally provide to your model's forward() method.
example = torch.rand(1, 3, 224, 224).cuda()

# Use torch.jit.trace to generate a torch.jit.ScriptModule via tracing.
traced_script_module = torch.jit.trace(model, example)
output = traced_script_module(torch.ones(1,3,224,224).cuda())
traced_script_module.save("traced_resnet_model.pt")
print(output)

It works.

  • c++
#include <torch/script.h>

#include <iostream>
#include <memory>

int main(int argc, const char* argv[]) {
    if (argc != 2) {
        std::cerr << "usage: example-app <path-to-exported-script-module>\n";
        return -1;
    }


    torch::jit::script::Module module;
    try {
        // Deserialize the ScriptModule from a file using torch::jit::load().
        module = torch::jit::load(argv[1]);
    }
    catch (const c10::Error& e) {
        std::cerr << "error loading the model\n";
        return -1;
    }


    module.to(at::kCUDA);
    std::cout << "ok\n";

    // Create a vector of inputs.
    std::vector<torch::jit::IValue> inputs;
    inputs.push_back(torch::ones({1, 3, 224, 224}).to(at::kCUDA));

    // Execute the model and turn its output into a tensor.
    at::Tensor output = module.forward(inputs).toTensor();
    std::cout << output.slice(/*dim=*/1, /*start=*/0, /*end=*/5) << '\n';
}

The above error occurred.

  • configuration
python 3.6
ubuntu 16.04
NVIDIA-SMI 430.64
CUDA Version 10.1
cudnn 7.6.6
torch 1.5.0+cu101

This meets the requirements listed at https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html

If CUDA works in Python, that should show that the NVIDIA driver and CUDA are compatible, so why does the error occur in C++? I’ve changed the NVIDIA driver several times, but that doesn’t solve the problem.
Thanks for any suggestions.

Reducing the CUDA version to 9.2 makes the error disappear.
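
For anyone trying to see why a lower CUDA version helps: error 35 says the installed driver supports an older CUDA version than the runtime the binary was linked against. One possible cause, not confirmed in this thread, is that the C++ build links a different CUDA runtime than the Python wheel does. Below is a minimal sketch of the comparison. The helper names (decode_cuda_version, format_cuda_version, driver_is_sufficient) are made up for illustration, and the simple "driver >= runtime" rule is an assumption that matches the CUDA 9.x/10.x setups discussed here.

```cpp
#include <string>

// CUDA reports versions as a single integer: 1000 * major + 10 * minor.
// For example, 10010 means CUDA 10.1 and 9020 means CUDA 9.2.
struct CudaVersion {
    int major;
    int minor;
};

// Hypothetical helper: split the encoded integer into major/minor parts.
CudaVersion decode_cuda_version(int encoded) {
    return CudaVersion{encoded / 1000, (encoded % 1000) / 10};
}

// Hypothetical helper: render the version as "major.minor" for logging.
std::string format_cuda_version(const CudaVersion& v) {
    return std::to_string(v.major) + "." + std::to_string(v.minor);
}

// Hypothetical helper, assuming the simple rule that the driver must support
// at least the CUDA version of the runtime the program links against.
// When this returns false, "driver version is insufficient" is expected.
bool driver_is_sufficient(int driver_encoded, int runtime_encoded) {
    return driver_encoded >= runtime_encoded;
}
```

In a real program the two integers would come from cudaDriverGetVersion(&driver) and cudaRuntimeGetVersion(&runtime) in the CUDA runtime API. Printing both from the C++ binary shows which runtime it actually links, which may differ from what Python uses.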

I have the same bug on the same torch version, 1.5+cu101.
The error happens when calling torch.cuda.set_device(i) from multiple processes.