Has anyone compared the throughput of a model optimized with both JIT and TensorRT?
It seems to depend on the specific network. The biggest speedup I’ve seen was close to three times as fast as PyTorch. The lowest was about one and a quarter times as fast.
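For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of a throughput timer. It is not the script behind the numbers above. The names `measure_throughput`, `speedup`, `run_batch` and `sync` are hypothetical, and the stand-in workload is only there so the sketch runs without any framework installed.

```python
import time
from typing import Callable, Optional


def measure_throughput(
    run_batch: Callable[[], object],
    batch_size: int,
    warmup: int = 10,
    iters: int = 50,
    sync: Optional[Callable[[], None]] = None,
) -> float:
    """Return samples per second for repeated calls to run_batch()."""
    # Warm-up calls let lazy initialization and JIT optimization passes
    # settle so they are not counted in the timed region.
    for _ in range(warmup):
        run_batch()
    if sync is not None:
        sync()  # wait for queued asynchronous (e.g. GPU) work before timing

    start = time.perf_counter()
    for _ in range(iters):
        run_batch()
    if sync is not None:
        sync()  # make sure all timed work has actually finished
    # Guard against a zero reading on coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return batch_size * iters / elapsed


def speedup(baseline: float, candidate: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return candidate / baseline


if __name__ == "__main__":
    # Stand-in workload (an assumption, not a real model forward pass).
    def fake_forward():
        return sum(i * i for i in range(10_000))

    print(f"{measure_throughput(fake_forward, batch_size=8):.1f} samples/s")
```

With a real model, `run_batch` would wrap the forward pass of the eager, JIT, or TensorRT version on the same input. On a GPU you would likely pass `torch.cuda.synchronize` as `sync`, since CUDA kernels are launched asynchronously and the timing is otherwise misleading.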
Thanks for sharing. That’s not too bad. Could you share which models showed that speedup, and perhaps the speedup numbers for some of the models you have tried? I would really appreciate any details.
TensorRT is much faster.
JIT and TRT are two different things.
Our team has been looking into PyTorch for a long time. JIT is a front end, while TRT is a back end.
Basically, JIT works from the Python side. It optimizes PyTorch code and tries to merge some ops before running the forward pass. If you dig into it, you will find that JIT and eager mode call the same set of ops, with only small differences.
TRT, however, is a separate acceleration engine that depends on NVIDIA GPUs. TRT fuses ops as aggressively as it can, and a fused kernel reduces the overhead of launching many separate kernels.
Besides, TRT has other exciting features.
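To make the point about fusion concrete, here is a toy sketch in plain Python. It is only an analogy, not TensorRT or PyTorch code: each list pass stands in for one kernel launch, and the functions `unfused` and `fused` are hypothetical names.

```python
from typing import List, Tuple


def unfused(xs: List[float], scale: float, bias: float) -> Tuple[List[float], int]:
    """Three separate 'kernels': each one reads and writes a full buffer."""
    launches = 0
    scaled = [x * scale for x in xs]            # kernel 1: multiply
    launches += 1
    shifted = [x + bias for x in scaled]        # kernel 2: add
    launches += 1
    activated = [max(x, 0.0) for x in shifted]  # kernel 3: ReLU
    launches += 1
    return activated, launches


def fused(xs: List[float], scale: float, bias: float) -> Tuple[List[float], int]:
    """One 'kernel' doing multiply, add and ReLU per element in a single pass."""
    out = [max(x * scale + bias, 0.0) for x in xs]
    return out, 1


if __name__ == "__main__":
    data = [-2.0, -0.5, 0.0, 1.5, 3.0]
    a, n_a = unfused(data, 2.0, 1.0)
    b, n_b = fused(data, 2.0, 1.0)
    print(a == b, n_a, n_b)  # prints: True 3 1
```

The result is the same, but the fused version makes one pass and builds no intermediate buffers. On a GPU, the analogous saving is fewer kernel launches and fewer round trips to memory.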
Is it possible to optimize a model with torch2trt and then script the optimized model with TorchScript (script/trace)? If so, will I get better optimization?
It is not recommended. torch2trt is designed to help developers deploy their scripted/traced models in TensorRT.
In detail, script/trace just translates the original PyTorch code into an IR graph, and torch2trt then maps and fuses that graph in TRT.
I have never tried the opposite flow. If you succeed, please let me know.