How to do ONNX to TensorRT in INT8 mode?

GB_K · August 14, 2020, 8:47am

Hello.

I am working on converting a PyTorch model to TensorRT.

Following a tutorial, I was able to export the PyTorch model to ONNX without trouble.
I also managed to convert the ONNX model to TensorRT in FP16 mode.

However, I am stuck on converting ONNX to TensorRT in INT8 mode.
The debugger always says `You need to do calibration for int8`.

Does anyone know how to convert an ONNX model to TensorRT in INT8 mode?

Thank you in advance.

seungjun · September 25, 2020, 8:07am

This thread says that exporting quantized PyTorch models to ONNX is not yet supported.

You can try quantizing the model with onnxruntime after you export it from PyTorch to ONNX.

Otherwise, you may want to check whether direct export from PyTorch to TensorRT supports quantized models.
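
As background on the "calibration" message in the question: INT8 inference needs a scale for each tensor that maps its float range onto the 8-bit integer range, and calibration is the step that estimates that range from sample inputs. The sketch below is plain Python and only illustrates the idea with a simple symmetric max-abs rule. It is not TensorRT's calibrator (whose algorithms, such as entropy calibration, are more involved), and the function names `calibrate_scale`, `quantize`, and `dequantize` are hypothetical, chosen for this example.

```python
# Illustration only: symmetric max-abs INT8 calibration in plain Python.
# Assumption: one scale per tensor; real calibrators use more refined rules.

def calibrate_scale(calibration_batches):
    """Return the scale that maps the largest observed |value| to 127."""
    max_abs = max(abs(v) for batch in calibration_batches for v in batch)
    if max_abs == 0:
        return 1.0  # all-zero data: fall back to a harmless non-zero scale
    return max_abs / 127.0


def quantize(values, scale):
    """Round each float to the nearest INT8 step, clamped to [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def dequantize(q_values, scale):
    """Map INT8 codes back to approximate float values."""
    return [q * scale for q in q_values]


# Made-up activation samples standing in for real calibration data.
batches = [[1.5, -63.5, 0.0], [20.0, -3.25, 7.0]]
scale = calibrate_scale(batches)

# 100.0 lies outside the calibrated range, so it saturates at the clamp limit.
codes = quantize([63.5, -10.0, 0.2, 100.0], scale)
restored = dequantize(codes, scale)
print(scale, codes, restored)
```

The saturated value shows why the calibration data should resemble real inputs: anything outside the range seen during calibration is clipped.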