PyTorch model deployment: does TensorRT speed up inference on desktop GPUs?

Tgaaly · May 25, 2018, 12:09am

What are the recommended ways to deploy a PyTorch model to a desktop machine (with 1080 Ti GPUs) for fast inference?

Related to this, does NVIDIA TensorRT speed up PyTorch model inference on 1080 Ti GPUs? If so, are there any benchmarks showing by how much for typical deep learning models?

Would Caffe2 be another option?

oscar · August 17, 2018, 11:28pm

We typically see a 2x speedup using TensorRT, with an additional 2x if you go to INT8. It is pretty amazing, actually.
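For anyone who wants to check figures like these on their own hardware, here is a minimal timing sketch, not a real benchmark. The helper names `median_latency` and `speedup` are made up for illustration, and `baseline_model` / `optimized_model` are hypothetical stand-ins that just sleep; you would replace them with calls into your own PyTorch and TensorRT inference code.

```python
import statistics
import time


def median_latency(fn, warmup=3, runs=20):
    """Return the median wall-clock time (seconds) of calling fn()."""
    # Warm-up calls are excluded so one-time setup costs do not skew the result.
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_fn, optimized_fn, **kwargs):
    """Baseline latency divided by optimized latency (2.0 means twice as fast)."""
    return median_latency(baseline_fn, **kwargs) / median_latency(optimized_fn, **kwargs)


# Hypothetical stand-ins: swap these for real inference calls.
# For GPU models, make sure each call waits for the device to finish
# (e.g. torch.cuda.synchronize() in PyTorch), since CUDA work is asynchronous
# and the timer would otherwise only measure the kernel launch.
def baseline_model():
    time.sleep(0.02)


def optimized_model():
    time.sleep(0.005)


if __name__ == "__main__":
    print(f"speedup: {speedup(baseline_model, optimized_model):.1f}x")
```

The median is used rather than the mean so that an occasional slow run has less effect on the reported ratio.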

Tgaaly · October 4, 2018, 5:56pm

Same for me: I have seen a large speedup using TensorRT. I don’t remember the exact number, but it was substantial.

alexbuyval · December 4, 2018, 11:20am

@oscar @Tgaaly

Guys, could you point me to a tutorial or guide on how to run inference on a PyTorch model with TensorRT?

Thank you!