PyTorch script model in C++ is slow on predicting the second image

When using a scripted model in C++ for an image detection task, it takes about 2 seconds to predict the first image, more than 60 seconds for the second image, and then less than 200 milliseconds (as expected) for the remaining images.
I have no idea why it spends so much time on the second image. Any ideas? Thanks

In case you are using the GPU, did you synchronize the code before starting and stopping the timers?
If so, are you seeing the slowdown only in the first iterations? This would be expected if e.g. cudnn.benchmark is used or the JIT is optimizing the graph.
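
For reference, a minimal LibTorch timing sketch that synchronizes the GPU before reading the clock on both sides of the forward pass (the model path and input shape are placeholders):

```cpp
#include <torch/script.h>
#include <torch/torch.h>
#include <chrono>
#include <iostream>

int main() {
    // "detector.pt" and the input shape below are placeholders.
    torch::jit::Module module = torch::jit::load("detector.pt");
    module.to(torch::kCUDA);
    module.eval();

    torch::NoGradGuard no_grad;
    auto input = torch::rand({1, 3, 224, 224}, torch::kCUDA);

    for (int i = 0; i < 5; ++i) {
        torch::cuda::synchronize();  // drain queued kernels before starting the timer
        auto start = std::chrono::steady_clock::now();

        auto output = module.forward({input});

        torch::cuda::synchronize();  // wait for the forward pass to actually finish
        auto stop = std::chrono::steady_clock::now();
        auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
        std::cout << "iteration " << i << ": " << ms << " ms" << std::endl;
    }
    return 0;
}
```

Without the synchronize calls, the timer only measures kernel launch overhead, and queued work from one iteration can get billed to the next.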

Only in the first two iterations in my experiment

Is there any method of optimizing the graph in advance?

I don’t think there is currently a way to store the optimized graph, and if I’m not mistaken the first iterations would then rerun the optimization (but @tom would know and correct me if I’m wrong).
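
One workaround is to pay that cost up front: run a few warm-up iterations on dummy input before timing or serving real requests. A minimal sketch, assuming the dummy input matches the shape, dtype, and device of the real inputs (the shape below is a placeholder):

```cpp
#include <torch/script.h>
#include <torch/torch.h>

// Run the forward pass a few times so profiling, graph optimization,
// and cuDNN autotuning all happen before the first real request.
void warm_up(torch::jit::Module& module, int iters = 3) {
    torch::NoGradGuard no_grad;
    auto dummy = torch::zeros({1, 3, 224, 224}, torch::kCUDA);  // placeholder shape
    for (int i = 0; i < iters; ++i) {
        module.forward({dummy});
    }
    torch::cuda::synchronize();  // make sure the warm-up work has completed
}
```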

Hi. I have also experienced this behavior with the same time gaps (the second iteration is about 30x slower than the first), but only on Windows. The problem does not seem to exist on Linux.

In case anyone is still experiencing this: if your input is of fixed size, you can disable optimization and prevent unwanted recompilations by using

```cpp
#include <torch/csrc/jit/runtime/graph_executor.h>

// Call once before the first forward pass.
torch::jit::FusionStrategy static0 = { {torch::jit::FusionBehavior::STATIC, 0} };
torch::jit::setFusionStrategy(static0);
torch::jit::setGraphExecutorOptimize(false);
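```

If I understand the profiling executor correctly, it typically profiles on the first run and compiles an optimized graph on the second, which would explain the spike landing exactly on iteration 2; the calls above skip that step. A minimal sketch of where they might sit, with placeholder model path and input shape:

```cpp
#include <torch/script.h>
#include <torch/torch.h>
#include <torch/csrc/jit/runtime/graph_executor.h>

int main() {
    // Disable re-profiling/recompilation before anything runs.
    torch::jit::FusionStrategy static0 = { {torch::jit::FusionBehavior::STATIC, 0} };
    torch::jit::setFusionStrategy(static0);
    torch::jit::setGraphExecutorOptimize(false);

    torch::jit::Module module = torch::jit::load("detector.pt");  // placeholder path
    module.eval();

    torch::NoGradGuard no_grad;
    auto input = torch::rand({1, 3, 640, 640});  // placeholder fixed-size input
    auto output = module.forward({input});
    return 0;
}
```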