What is the USE_TENSORRT flag used for?

I’ve built PyTorch from source for the Jetson Xavier successfully, and I am now interested in accelerating inference. What does the USE_TENSORRT flag accomplish? It seems like, even without this flag, I could convert my model to ONNX, build the engine, and call the NVinfer API directly?
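For reference, the ONNX route I have in mind looks roughly like this (just a sketch; the model, input shape, and file names are placeholders, and the engine build/inference would still happen outside PyTorch via trtexec or the TensorRT runtime API):

```python
# Sketch of the ONNX route (placeholder model and shapes; adjust for your own network).
import torch
import torchvision

model = torchvision.models.resnet18(pretrained=True).eval()
dummy_input = torch.randn(1, 3, 224, 224)

# Export a traced copy of the model to ONNX so TensorRT can parse it.
torch.onnx.export(
    model,
    dummy_input,
    "resnet18.onnx",
    opset_version=11,
    input_names=["input"],
    output_names=["output"],
)

# On the Jetson the engine would then be built outside of PyTorch, e.g.:
#   trtexec --onnx=resnet18.onnx --saveEngine=resnet18.engine --fp16
# and executed through the TensorRT / nvinfer runtime API.
```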

Thanks,
Rich

Hi, I looked at this flag when I wanted to compile PyTorch myself. AFAIK it enables PyTorch to use the NVIDIA TensorRT library.

Hopefully a source build still works with USE_TENSORRT set even when targeting older GPUs, e.g. a 3.5 compute capability value in TORCH_ARCH.
Quoting the TensorRT site:
You can import trained models from every deep learning framework into TensorRT. After applying optimizations, TensorRT selects platform specific kernels to maximize performance on Tesla GPUs in the data center, Jetson embedded platforms, and NVIDIA DRIVE autonomous driving platforms.
With TensorRT developers can focus on creating novel AI-powered applications rather than performance tuning for inference deployment.

CCMake hint: [screenshot of the USE_TENSORRT option in the ccmake configuration screen]

Supported HW (CUDA compute capability > 5): [supporting table/screenshot not preserved]

Did anyone get to the bottom of this? I have been trying to run my PyTorch models through TensorRT for inference, but it has been pretty painful. The ONNX route kind of works but is not ideal. Then there are torch2trt and TRTorch, which are beta/alpha projects from NVIDIA. Is PyTorch working on supporting an nvinfer/TensorRT backend?
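For anyone comparing the options, the torch2trt route is only a couple of lines (a sketch following the project README at https://github.com/NVIDIA-AI-IOT/torch2trt; the model and input shape are placeholders and the package has to be installed separately):

```python
# Sketch of the torch2trt route (placeholder model/input; requires the torch2trt package).
import torch
import torchvision
from torch2trt import torch2trt

model = torchvision.models.resnet18(pretrained=True).eval().cuda()
x = torch.randn(1, 3, 224, 224).cuda()

# torch2trt traces the module and builds a TensorRT engine wrapped in a module-like object.
model_trt = torch2trt(model, [x])

y = model(x)
y_trt = model_trt(x)
print(torch.max(torch.abs(y - y_trt)))  # sanity-check against the original model
```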

@rafale77
You may want to check out pytorch-quantization maintained by NVIDIA.

According to the documentation, it supports post-training quantization and quantization-aware fine-tuning.
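As a rough illustration (a sketch only, based on my reading of the pytorch-quantization docs; exact module names and the calibration/export steps may differ between releases):

```python
# Sketch of NVIDIA's pytorch-quantization workflow (pip install pytorch-quantization).
# Module names follow my reading of the docs and may differ between releases.
import torchvision
from pytorch_quantization import quant_modules

# Swap torch.nn layers for quantized equivalents before the model is created,
# so fake-quantization (Q/DQ) nodes get inserted automatically.
quant_modules.initialize()

model = torchvision.models.resnet18(pretrained=True)

# From here the docs describe two paths:
#   * post-training quantization: feed calibration data through the model and
#     load the collected ranges into the quantizers, or
#   * quantization-aware fine-tuning: keep training for a few epochs.
# The quantized model is then exported to ONNX and built into an INT8 TensorRT engine.
```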

The USE_TENSORRT flag probably does several things in the build, but at least one of them is to try to build the onnx-tensorrt package from GitHub. The thing is, the submodule pointer in the pytorch repo still points to a 2019 tag/commit of the onnx-tensorrt repo, even though there have been several releases since then. That commit builds against a rather old version of TensorRT (I think pre-7.0.x), so there is no way onnx-tensorrt builds in the pytorch tree against any reasonably new TensorRT API/version.

This leads me to conclude that the USE_TENSORRT flag is not supported: if it were, the pytorch maintainers would at least update the submodule pointer and document which version of TensorRT to build against. My guess is that they started down some route of making a TRT inference backend for pytorch, abandoned it, and haven't cleaned up the repo or the cmake/setup.py options.