At inference time, or when doing hyperparameter search, I would like to run multiple processes, each with its own PyTorch model, independently of one another. I'm trying to achieve this on CPU. For hyperparameter search, for instance, I wrote some code that uses a pool of processes to try different hyperparameter configurations at the same time. I experimented with PyTorch and a basic scikit-learn logistic regression:
- When using the scikit-learn logistic regression, there is a clear gain from the multiprocessing pool: one process takes about twice as long as two processes.
- However, when using PyTorch models, there is no gain from multiprocessing at all. In fact, the more processes I use, the slower it becomes.
Therefore I was wondering: could MKL be responsible for this behavior? Since the matrix operations are multi-threaded, could multiple processes running MKL be competing too much with one another, reducing the efficiency of training or inference?
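One way to probe this hypothesis is to cap the number of intra-op threads in each worker so that the processes together do not request more threads than there are cores. The sketch below is only illustrative and is not the code from the experiment described above: `threads_per_worker`, `init_worker`, `train_and_score`, `run_search`, and the `size`/`steps` config keys are hypothetical names, and the "training" is a stand-in made of a few matrix products. The only PyTorch call it relies on is `torch.set_num_threads`.

```python
import os
from multiprocessing import get_context


def threads_per_worker(n_workers, n_cores=None):
    """Split the available cores evenly, with at least one thread per worker."""
    n_cores = n_cores or os.cpu_count() or 1
    return max(1, n_cores // n_workers)


def init_worker(n_threads):
    # Runs once in every worker process and limits that process's
    # intra-op thread pool.
    import torch
    torch.set_num_threads(n_threads)


def train_and_score(config):
    # Hypothetical stand-in for a real training loop: repeated matrix
    # products whose cost depends on the configuration.
    import torch
    x = torch.randn(config["size"], config["size"])
    for _ in range(config["steps"]):
        x = torch.tanh(x @ x / config["size"])
    return config, float(x.abs().mean())


def run_search(configs, n_workers):
    # "spawn" starts each worker as a fresh interpreter, so no thread-pool
    # state is inherited from the parent process.
    n_threads = threads_per_worker(n_workers)
    ctx = get_context("spawn")
    with ctx.Pool(n_workers, initializer=init_worker,
                  initargs=(n_threads,)) as pool:
        return pool.map(train_and_score, configs)
```

With the `spawn` start method, `run_search(configs, n_workers=2)` has to be called from under an `if __name__ == "__main__":` guard. Timing it once with the thread cap and once without (for example by passing a larger value to `init_worker`) would show whether oversubscription accounts for the slowdown.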