Can an int8 model derived from PyTorch's QAT training be converted directly to TensorRT?

Can an int8 model derived from PyTorch's QAT training be converted directly to TensorRT? The int8 model trained with QAT failed to convert to ONNX, so I want to try converting it directly to TensorRT for GPU inference.

Hi @lishanlu136, have you tried the steps outlined in this tutorial: Deploying Quantization Aware Trained models in INT8 using Torch-TensorRT — Torch-TensorRT v1.3.0 documentation
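
In outline, that tutorial does QAT with NVIDIA's pytorch-quantization toolkit and then hands a TorchScript export to Torch-TensorRT, with no ONNX step in between. A rough sketch of the training/export side, assuming that toolkit (the toy network, random calibration data, shapes, and file name below are placeholders, not from the tutorial):

```python
import torch
import torch.nn as nn
from pytorch_quantization import quant_modules
from pytorch_quantization import nn as quant_nn

# Patch torch.nn layers with quantized equivalents *before* building the model,
# so Conv/Linear layers carry fake-quantization (Q/DQ) nodes during training.
quant_modules.initialize()

model = nn.Sequential(              # toy placeholder network
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),
).cuda()

# Give every quantizer an initial scale (amax) by running a few batches in
# calibration mode; in practice use real data and then fine-tune (QAT) as usual.
with torch.no_grad():
    for m in model.modules():
        if isinstance(m, quant_nn.TensorQuantizer):
            m.disable_quant()
            m.enable_calib()
    for _ in range(4):
        model(torch.randn(8, 3, 224, 224, device="cuda"))
    for m in model.modules():
        if isinstance(m, quant_nn.TensorQuantizer):
            m.load_calib_amax()
            m.enable_quant()
            m.disable_calib()

# ... QAT fine-tuning loop goes here ...

# Export a TorchScript module with the fake-quant ops recorded in the graph;
# this is what Torch-TensorRT consumes for INT8 compilation.
quant_nn.TensorQuantizer.use_fb_fake_quant = True
model.eval()
with torch.no_grad():
    traced = torch.jit.trace(model, torch.randn(1, 3, 224, 224, device="cuda"))
torch.jit.save(traced, "qat_model.jit.pt")
```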

Thank you, I read the documentation carefully. Is this tutorial using the library provided by TensorRT to apply QAT to the PyTorch model?

I think you can try using something along the lines of the following:
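
For instance, roughly along these lines (a minimal sketch, assuming the QAT model was exported to TorchScript as in the tutorial above; the file name and input shape are placeholders):

```python
import torch
import torch_tensorrt

# Load the TorchScript QAT model exported after training (placeholder file name).
qat_model = torch.jit.load("qat_model.jit.pt").eval()

# Compile with INT8 enabled; Torch-TensorRT picks up the scales from the
# fake-quant (Q/DQ) nodes recorded during QAT, so no ONNX export and no
# separate PTQ calibrator are needed.
trt_model = torch_tensorrt.compile(
    qat_model,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224))],
    enabled_precisions={torch.int8},
)

out = trt_model(torch.randn(1, 3, 224, 224, device="cuda"))
```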

It's still in an early prototype phase, I believe, but theoretically that should work if your model is traceable.