Slow multi-task model inference with Python threading

Hi all, I've hit a big problem!

On the left (screenshot), I run the model on the GPU alone, without Python 3 threading.
When I run the multi-task GPU model with threading, it is very slow, as shown on the right.
Why?

Hi, have you solved this problem?
I have the same problem.

This needs clarification. What do you mean by threading?

If you are using Python multi-threading, the GIL prevents threads from running Python bytecode in parallel, so it is not recommended for this (you also need to be careful to measure running time properly with multi-threading). You should use multi-processing instead.
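To illustrate the suggestion above, here is a minimal sketch comparing a thread pool and a process pool on a CPU-bound stand-in for inference. The `fake_infer` function is a hypothetical placeholder (your real model call would go there); under threads the GIL serializes it, while processes can run it in parallel. Note that with a real GPU model, each process typically needs its own CUDA context, so the model should be loaded inside each worker process, not before forking.

```python
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def fake_infer(n):
    # Hypothetical stand-in for a model inference call.
    # Pure-Python loop, so the GIL serializes it across threads.
    total = 0
    for i in range(n):
        total += i * i
    return total

def run(pool_cls, n_tasks=4, work=200_000):
    # Run n_tasks identical jobs in the given executor and time them.
    start = time.perf_counter()
    with pool_cls(max_workers=n_tasks) as pool:
        results = list(pool.map(fake_infer, [work] * n_tasks))
    return time.perf_counter() - start, results

if __name__ == "__main__":
    t_threads, _ = run(ThreadPoolExecutor)
    t_procs, _ = run(ProcessPoolExecutor)
    print(f"threads: {t_threads:.3f}s  processes: {t_procs:.3f}s")
```

On a multi-core machine the process pool should finish this CPU-bound workload noticeably faster than the thread pool, which is the behavior the answer above describes.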