How to do ONNX to TensorRT in INT8 mode?

This thread says that exporting quantized PyTorch models to ONNX is not yet supported.

You can try quantizing after you export the PyTorch model to ONNX, using onnxruntime.
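
If you go that route, onnxruntime ships quantization utilities under `onnxruntime.quantization`. A minimal sketch of post-export dynamic (weight-only) quantization, assuming an already exported `model.onnx` (both file paths here are placeholders):

```python
from onnxruntime.quantization import quantize_dynamic, QuantType

# Quantize the weights of the exported ONNX model to INT8.
# "model.onnx" and "model.int8.onnx" are placeholder paths.
quantize_dynamic(
    model_input="model.onnx",
    model_output="model.int8.onnx",
    weight_type=QuantType.QInt8,
)
```

Note that if the goal is to feed the result into TensorRT, static quantization in QDQ format (`quantize_static` with a calibration data reader) is the variant TensorRT's ONNX parser is meant to consume, as far as I know; the dynamically quantized operator format is mainly for running inside onnxruntime itself.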

Otherwise, you may want to check whether direct export from PyTorch to TensorRT supports quantized models.
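
For completeness on the TensorRT side of the original question: once you do have an ONNX file, INT8 mode is normally enabled through the builder config plus a calibrator (unless the model already carries Q/DQ nodes with its own scales). A rough sketch, assuming the TensorRT 8.x Python API, a placeholder `model.onnx`, and a stub calibrator that you would have to flesh out with real calibration data:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)


class MyEntropyCalibrator(trt.IInt8EntropyCalibrator2):
    """Stub calibrator. A real implementation must return device pointers
    to representative input batches from get_batch(); returning None
    immediately (as here) means no calibration data is supplied."""

    def __init__(self):
        trt.IInt8EntropyCalibrator2.__init__(self)

    def get_batch_size(self):
        return 1

    def get_batch(self, names):
        return None  # None signals that calibration data is exhausted

    def read_calibration_cache(self):
        return None

    def write_calibration_cache(self, cache):
        pass


# Parse the ONNX file into a TensorRT network (explicit batch).
builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, TRT_LOGGER)
with open("model.onnx", "rb") as f:  # placeholder path
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("failed to parse ONNX model")

# Enable INT8 and attach the calibrator.
config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.INT8)
config.int8_calibrator = MyEntropyCalibrator()

# Build and save the serialized engine.
serialized_engine = builder.build_serialized_network(network, config)
with open("model_int8.plan", "wb") as f:
    f.write(serialized_engine)
```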