How to keep accuracy when converting a PyTorch model to TensorRT in FP16 mode

lcxywfe · June 1, 2020, 3:29am

I want to use TensorRT FP16 mode to accelerate my PyTorch model, but it causes a loss of precision. How can I use PyTorch's mixed-precision training to avoid losing accuracy when converting to a TensorRT FP16 model? I have tried training the model with torch.cuda.amp.autocast, but it still loses some precision.

ptrblck · June 1, 2020, 8:25am

Do you see a loss in accuracy comparing plain PyTorch code to mixed-precision training using torch.cuda.amp?
How are you exporting the model to TensorRT?

lcxywfe · June 1, 2020, 5:37pm

The mixed-precision training is OK. The loss is introduced by the conversion to the TensorRT FP16 engine. The export path is:
PyTorch -> ONNX -> TensorRT
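
For context on why a pure FP16 engine can behave differently from autocast training (autocast keeps some ops in FP32, while an FP16 engine may run most layers in half precision), here is a minimal sketch of FP16 rounding using only the Python standard library. The helper name `to_fp16` is hypothetical and is not part of PyTorch or TensorRT; it only illustrates the number format, not what TensorRT does internally.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value.

    Hypothetical helper for illustration only. struct's "e" format is the
    16-bit binary16 type; values too large for it raise OverflowError,
    which we map to a signed infinity the way FP16 hardware would saturate.
    """
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return float("inf") if x > 0 else float("-inf")


# Three typical failure modes of FP16:
# 1. Limited mantissa (about 3 decimal digits): 0.1 is not stored exactly.
rounded_small = to_fp16(0.1)
# 2. Coarse spacing for large values: above 2048 only even integers exist.
rounded_large = to_fp16(2049.0)
# 3. Narrow range: tiny values underflow to 0, values above 65504 overflow.
underflow = to_fp16(1e-8)
overflow = to_fp16(70000.0)
```

Activations or weights that fall into the second or third case are usual suspects when an FP16 engine diverges from the FP32 model, so checking the value ranges of intermediate tensors is a reasonable first step.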

ptrblck · June 2, 2020, 4:58am

How large is the accuracy drop after the conversion to TensorRT?
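
One way to put a number on the drop is to run the same inputs through the PyTorch model and the TensorRT engine and compare the outputs. The sketch below assumes both sets of outputs have already been copied into plain Python lists; the function names are hypothetical and the values in the usage part are made-up toy numbers, not measurements from this thread.

```python
from typing import Sequence


def max_abs_diff(ref: Sequence[float], test: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two output vectors."""
    return max(abs(r - t) for r, t in zip(ref, test))


def max_rel_diff(ref: Sequence[float], test: Sequence[float],
                 eps: float = 1e-6) -> float:
    """Largest element-wise relative difference; eps avoids division by zero."""
    return max(abs(r - t) / (abs(r) + eps) for r, t in zip(ref, test))


def argmax(values: Sequence[float]) -> int:
    """Index of the largest element (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def top1_agreement(ref_batch: Sequence[Sequence[float]],
                   test_batch: Sequence[Sequence[float]]) -> float:
    """Fraction of samples where both models predict the same class."""
    same = sum(argmax(r) == argmax(t) for r, t in zip(ref_batch, test_batch))
    return same / len(ref_batch)


# Toy logits for two samples (illustrative values only).
pytorch_logits = [[1.0, 2.0, 3.0], [0.5, 0.2, 0.1]]
tensorrt_logits = [[1.0, 2.0, 2.5], [0.1, 0.2, 0.5]]

agreement = top1_agreement(pytorch_logits, tensorrt_logits)
worst_abs = max_abs_diff(pytorch_logits[0], tensorrt_logits[0])
```

Reporting both the raw output difference and the task metric (for example top-1 agreement) helps separate harmless rounding noise from a real accuracy regression.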