I run a Python loop that is parallelized with the multiprocessing library. Each iteration of the loop calls a trained torch model to make a prediction. Everything runs on CPU.
```python
for _ind in inds:  # process each sample
    f = pool.apply_async(function_launching_torch_model, [some_parameters])
```
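For context, here is a minimal, self-contained sketch of the pattern I mean (the worker function and data are placeholders standing in for `function_launching_torch_model` and the real samples; the actual code loads a torch model instead):

```python
import multiprocessing as mp

def predict_one(x):
    # Placeholder for function_launching_torch_model: in the real code this
    # runs a forward pass of the trained torch model on one sample.
    return x * x  # stand-in for a model prediction

if __name__ == "__main__":
    inds = range(8)
    with mp.Pool(processes=4) as pool:
        # submit one task per sample, then collect the results
        async_results = [pool.apply_async(predict_one, (i,)) for i in inds]
        predictions = [r.get() for r in async_results]
    print(predictions)  # → [0, 1, 4, 9, 16, 25, 36, 49]
```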
The torch model turns out to be much slower this way than when the loop runs sequentially. My guess is that there is some kind of "conflict" or contention between the processes?
Does anyone have a suggestion for keeping the loop parallel while still getting good performance?
Thanks in advance!