PyTorch script model in C++ is slow on predicting the second image

When using a scripted model in C++ for an image detection task, it takes about 2 seconds to predict the first image, more than 60 seconds for the second image, and then less than 200 milliseconds (as expected) for the remaining images.
I have no idea why it spends so much time on the second image. Any ideas? Thanks

In case you are using the GPU, did you synchronize the code before starting and stopping the timers?
If so, are you seeing the slowdown only in the first iterations? This would be expected if e.g. cudnn.benchmark is used or the JIT is optimizing the graph.
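
For reference, a minimal LibTorch timing sketch that synchronizes the GPU before reading the clock on both sides of the forward pass (the model path and input shape are placeholders):

```cpp
#include <torch/script.h>
#include <torch/torch.h>
#include <chrono>
#include <iostream>

int main() {
    // "detector.pt" and the input shape below are placeholders.
    torch::jit::Module module = torch::jit::load("detector.pt");
    module.to(torch::kCUDA);
    module.eval();

    torch::NoGradGuard no_grad;
    auto input = torch::rand({1, 3, 224, 224}, torch::kCUDA);

    for (int i = 0; i < 5; ++i) {
        torch::cuda::synchronize();  // drain queued kernels before starting the timer
        auto start = std::chrono::steady_clock::now();

        auto output = module.forward({input});

        torch::cuda::synchronize();  // wait for the forward pass to actually finish
        auto stop = std::chrono::steady_clock::now();
        auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
        std::cout << "iteration " << i << ": " << ms << " ms" << std::endl;
    }
    return 0;
}
```

Without the synchronize calls, the timer only measures kernel launch overhead, and queued work from one iteration can get billed to the next.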

Only in the first two iterations in my experiment

Is there any method of optimizing the graph in advance?

I don’t think there is currently a way to store the optimized graph, and if I’m not mistaken the first iterations would then rerun the optimization (but @tom would know and correct me if I’m wrong).
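
One workaround is to pay that cost up front: run a few warm-up iterations on dummy input before timing or serving real requests. A minimal sketch, assuming the dummy input matches the shape, dtype, and device of the real inputs (the shape below is a placeholder):

```cpp
#include <torch/script.h>
#include <torch/torch.h>

// Run the forward pass a few times so profiling, graph optimization,
// and cuDNN autotuning all happen before the first real request.
void warm_up(torch::jit::Module& module, int iters = 3) {
    torch::NoGradGuard no_grad;
    auto dummy = torch::zeros({1, 3, 224, 224}, torch::kCUDA);  // placeholder shape
    for (int i = 0; i < iters; ++i) {
        module.forward({dummy});
    }
    torch::cuda::synchronize();  // make sure the warm-up work has completed
}
```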

Hi. I have also experienced this behavior with the same time gaps (the second iteration is about 30x slower than the first), but only on Windows. The problem does not seem to exist on Linux.

In case anyone is still experiencing this: if your input is of fixed size, you can disable optimization and prevent unwanted recompilations by using

```cpp
#include <torch/csrc/jit/runtime/graph_executor.h>

// Call once before the first forward pass.
torch::jit::FusionStrategy static0 = { {torch::jit::FusionBehavior::STATIC, 0} };
torch::jit::setFusionStrategy(static0);
torch::jit::setGraphExecutorOptimize(false);
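```

If I understand the profiling executor correctly, it typically profiles on the first run and compiles an optimized graph on the second, which would explain the spike landing exactly on iteration 2; the calls above skip that step. A minimal sketch of where they might sit, with placeholder model path and input shape:

```cpp
#include <torch/script.h>
#include <torch/torch.h>
#include <torch/csrc/jit/runtime/graph_executor.h>

int main() {
    // Disable re-profiling/recompilation before anything runs.
    torch::jit::FusionStrategy static0 = { {torch::jit::FusionBehavior::STATIC, 0} };
    torch::jit::setFusionStrategy(static0);
    torch::jit::setGraphExecutorOptimize(false);

    torch::jit::Module module = torch::jit::load("detector.pt");  // placeholder path
    module.eval();

    torch::NoGradGuard no_grad;
    auto input = torch::rand({1, 3, 640, 640});  // placeholder fixed-size input
    auto output = module.forward({input});
    return 0;
}
```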