I know that using TensorRT gives better performance than CUDA libtorch. I built libtorch from source with the USE_TENSORRT
flag. Will this give me a performance improvement for GPU inference, or is it used for another purpose?
Where are you passing this argument when building libtorch?
Are you using the USE_TENSORRT=1 env variable for Caffe2 as described here?
I passed -DUSE_TENSORRT to CMake directly (via a C++ package manager). CMake found TensorRT. After compilation I got the nvonnxparser_static.lib library (Windows static build) and then linked my application against the nvinfer and nvonnxparser_static libraries.
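For reference, the two ways of enabling the flag discussed above might look roughly like this. This is a sketch, not a verified build recipe: the source-tree path, build directory, and any extra CMake options are placeholders, and on Windows the generator and TensorRT paths will differ per setup.

```shell
# Option 1: env variable picked up by the setup.py build (as asked above)
USE_TENSORRT=1 python setup.py install

# Option 2: passing the flag to CMake directly (as described in the reply);
# /path/to/pytorch is a placeholder for the checked-out source tree
cmake -DUSE_TENSORRT=ON /path/to/pytorch
```

Either way, checking the CMake configure output for a line confirming that TensorRT was found is a quick way to verify the flag took effect.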