I have been playing around with the C++ frontend for PyTorch on my laptop (Intel® Core™ i7-4600U) and was able to include PyTorch in my C++ app by following the MNIST example (https://github.com/goldsborough/examples/blob/cpp/cpp/mnist/mnist.cpp).
My app already uses parallelization to some degree, so I would like to run training / inference on a single thread. However, I have not found a way to tell PyTorch to do that: it always uses all cores. Here is what I have tried so far:
I could not find any “torch.set_num_threads(1)” so I searched the PyTorch sources for anything related and found at::set_num_threads(1);
Setting the OpenMP environment variable “OMP_NUM_THREADS=1”
Setting the MKL environment variable “MKL_NUM_THREADS=1”
Further investigation of the source code revealed that Caffe2 uses a ThreadPool which is initialized with cpuinfo_get_processors_count() from the cpuinfo lib (https://github.com/pytorch/cpuinfo). I did not find a way to set this from the outside.
Note that I did not compile PyTorch myself, but used the library provided on the website. There seem to be a lot of different frameworks involved, such as OpenMP, MKL, MKL-DNN, etc., so I am a little confused about how to force PyTorch to use one thread. Any ideas?
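For reference, the at::set_num_threads attempt described above can be written with a read-back of the value, which at least shows whether ATen accepted the setting. This is only a sketch: the helper name limit_aten_threads is made up for illustration, and the __has_include guard is there so the snippet also compiles when libtorch is not on the include path. A read-back of 1 does not by itself show that other thread pools (for example the Caffe2 one mentioned above) are limited as well.

```cpp
#if defined(__has_include)
#  if __has_include(<ATen/Parallel.h>)
#    include <ATen/Parallel.h>
#    define HAVE_ATEN_PARALLEL 1
#  endif
#endif

// Hypothetical helper (not part of libtorch): request a single thread from
// ATen and return the value ATen reports afterwards.
// Returns -1 when the ATen headers are not available at compile time.
int limit_aten_threads() {
#ifdef HAVE_ATEN_PARALLEL
    at::set_num_threads(1);
    return at::get_num_threads();
#else
    return -1;
#endif
}
```

Calling limit_aten_threads() at the very top of main, before any tensor work, and printing the result would be one way to compare what ATen reports against what the system monitor shows.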
@yf225
Yes, I added omp_set_num_threads(1) (which takes precedence over OMP_NUM_THREADS=1) at the beginning of the main function, and it still uses all the CPU cores.
I also removed omp_set_num_threads(1) from the code and set OMP_NUM_THREADS=1 on the command line before running the MNIST example, and it still uses all of the CPU cores.
@afshin67 did you manage to limit the thread count to 1? I have the same problem: after calling omp_set_num_threads(1) or torch::set_num_threads(1), all cores are still occupied. My libtorch was downloaded from the official PyTorch website too.
My PC has 8 cores, and the results of torch::get_num_threads() and omp_get_max_threads() differ. Also, calling torch::set_num_threads(1) had no effect on omp_get_max_threads().
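To check the OpenMP side on its own, a small sketch like the following can help. It uses only the standard OpenMP calls omp_set_num_threads and omp_get_max_threads; the two helper names are made up for illustration, and the _OPENMP guard lets the file compile even when OpenMP is not enabled (with GCC or Clang, OpenMP is enabled via -fopenmp). It only inspects the OpenMP runtime that the calling code is linked against. Whether libtorch uses that same runtime instance is an assumption this sketch cannot verify.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: the thread count OpenMP would use for the next
// parallel region, or 1 if this file was compiled without OpenMP support.
int omp_threads_visible_here() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

// Hypothetical helper: request a single OpenMP thread, then report what
// the runtime says. The runtime call overrides OMP_NUM_THREADS.
int request_single_omp_thread() {
#ifdef _OPENMP
    omp_set_num_threads(1);
#endif
    return omp_threads_visible_here();
}
```

If request_single_omp_thread() returns 1 while all cores are still busy, the extra threads are presumably coming from somewhere other than the OpenMP runtime this code talks to.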
Hi, has there been any progress on this problem? We have a similar problem with PyTorch 1.1 and the C++ frontend. In Python, I am able to set the thread limit to 1.
I’m getting the same problem with my C++ torch project on Windows (I’m building it as a DLL and calling it from a .NET Core console app). I have set both the ‘OMP_NUM_THREADS’ and the ‘MKL_NUM_THREADS’ environment variables to 1. I have also tried ‘#include "ATen/Parallel.h"’ and ‘at::set_num_threads(1);’ at the top of my code. Still, running the app spawns 14 threads and maxes out all 8 cores. Has anyone solved this on Windows?
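When the library is loaded by a host process, as with the DLL case above, it may be worth confirming that the two variables are actually visible inside that process. Below is a minimal sketch using only the C++ standard library; the helper names env_or_unset and thread_env_summary are made up for illustration. It says nothing about whether libtorch honors the values, only whether they reached the process.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: value of an environment variable as seen by this
// process, or "<unset>" if it is not defined.
std::string env_or_unset(const char* name) {
    const char* value = std::getenv(name);
    return value ? std::string(value) : std::string("<unset>");
}

// Hypothetical helper: one-line summary of the two variables discussed here.
std::string thread_env_summary() {
    return "OMP_NUM_THREADS=" + env_or_unset("OMP_NUM_THREADS") +
           " MKL_NUM_THREADS=" + env_or_unset("MKL_NUM_THREADS");
}
```

Logging thread_env_summary() from inside the DLL, before the first torch call, would show whether the settings made in the shell or system dialog were inherited by the host app.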