I’m trying to save a serialized TensorRT-optimized model using torch_tensorrt in one environment and then load it in another environment (different GPUs: one has a Quadro M1000M, the other a Tesla P100).
In both environments I don’t have full sudo access (i.e. I can’t change the NVIDIA driver), but I can install different CUDA toolkits locally and install packages via pip wheels.
Here is my setup in each environment:
env #1 =
Tesla P100
NVIDIA driver 460
CUDA 11.3 (checked via torch.version.cuda); nvidia-smi shows 11.2; many CUDA versions installed, from 10.2 to 11.4
cuDNN 8.2.1.32
TensorRT 8.2.1.8
Torch-TensorRT 1.0.0
PyTorch 1.10.1+cu113 (conda installed)
env #2 =
Quadro M1000M
NVIDIA driver 455
CUDA 11.3 (checked via torch.version.cuda; running in compatibility mode, I believe, since 11.3 technically requires a 460+ NVIDIA driver according to the compatibility table); nvidia-smi shows 11.1; 10.2 is also available aside from the 11.3 I installed
cuDNN 8.2.1.32
TensorRT 8.2.1.8
Torch-TensorRT 1.0.0
PyTorch 1.10.1+cu113 (pip installed)
So, as you can see, the only real differences are the GPU and the NVIDIA driver (455 vs. 460).
Is this supposed to work?
On env #1, I can compile any model with torch_tensorrt.
On env #2, I run into issues if I try to compile even slightly complex models (e.g. resnet34), where it says:
WARNING: [Torch-TensorRT] - Dilation not used in Max pooling converter
WARNING: [Torch-TensorRT TorchScript Conversion Context] - TensorRT was linked against cuBLAS/cuBLAS LT 11.6.3 but loaded cuBLAS/cuBLAS LT 11.5.1
ERROR: [Torch-TensorRT TorchScript Conversion Context] - 1: [wrapper.cpp::plainGemm::197] Error Code 1: Cublas (CUBLAS_STATUS_NOT_SUPPORTED)
ERROR: [Torch-TensorRT TorchScript Conversion Context] - 2: [builder.cpp::buildSerializedNetwork::609] Error Code 2: Internal Error (Assertion enginePtr != nullptr failed. )
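For reference, the compile-and-save step that works on env #1 looks roughly like this (a minimal sketch against the torch_tensorrt 1.0.0 API; the input shape, file path, and helper name are placeholders I chose for illustration):

```python
def compile_and_save(model, path, shape=(1, 3, 224, 224)):
    """Compile a model with Torch-TensorRT and save the resulting
    TorchScript module, which embeds the serialized TensorRT engine."""
    # Imported lazily: both require a CUDA-capable GPU at runtime.
    import torch
    import torch_tensorrt

    trt_module = torch_tensorrt.compile(
        model.eval().cuda(),
        inputs=[torch_tensorrt.Input(shape=shape, dtype=torch.float32)],
        enabled_precisions={torch.float32},
    )
    # torch.jit.save serializes the module, engine included.
    torch.jit.save(trt_module, path)
    return trt_module
```

For example, `compile_and_save(torchvision.models.resnet34(pretrained=True), "resnet34_trt.ts")` would be the call that succeeds on env #1 but hits the cuBLAS error above on env #2.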
If I try to torch.jit.load any model made in env #1 (even the simplest ones, like a model with a single Conv2d layer) on env #2, I get the following error message:
~/.local/lib/python3.6/site-packages/torch/jit/_serialization.py in load(f, map_location, _extra_files)
159 cu = torch._C.CompilationUnit()
160 if isinstance(f, str) or isinstance(f, pathlib.Path):
→ 161 cpp_module = torch._C.import_ir_module(cu, str(f), map_location, _extra_files)
162 else:
163 cpp_module = torch._C.import_ir_module_from_buffer(
RuntimeError: [Error thrown at core/runtime/TRTEngine.cpp:44] Expected most_compatible_device to be true but got false
No compatible device was found for instantiating TensorRT engine
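For context, the failing step is just a plain torch.jit.load of the module saved on env #1 (a sketch; the path and function name are placeholders). One plausible cause, hedged here rather than confirmed: a serialized TensorRT engine is built for the GPU architecture it was compiled on, and the P100 (Pascal) and M1000M (Maxwell) have different compute capabilities, which would match the "No compatible device" message:

```python
def load_trt_module(path, device="cuda:0"):
    """Load a Torch-TensorRT compiled TorchScript module.

    Deserializing the embedded TensorRT engine requires a GPU compatible
    with the one it was built on; otherwise torch.jit.load raises the
    "No compatible device" RuntimeError shown above.
    """
    import torch  # lazy import so the sketch stays importable anywhere

    return torch.jit.load(path, map_location=device)
```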
Based on the warnings and errors raised, I would recommend creating an issue in the Torch-TensorRT GitHub repository so that the developers can help debug this.