I have a fairly small PyTorch model (a feedforward neural network with 3 hidden layers of 64 units each) that I use heavily for inference. There are other processes running, and I want inference to happen on only one thread. Inference (calling model->Forward(v)) happens on a CPU thread.
This thread somehow spawns 16 subthreads, which I don't want: there is a lot of overhead, and the CPU usage on those threads slows down the tasks running there. So I call:
torch::set_num_threads(1);
torch::set_num_interop_threads(1);
which should prevent multithreading. Somehow this doesn't work, and 16 threads are still created. Any idea what is causing this? Is this a bug?
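One thing worth checking (a sketch, under the assumption that the extra threads come from the OpenMP/MKL pools, which is common for CPU inference): those backend pools can also be sized via environment variables set before the process launches, since the in-process setters only take effect if they run before any parallel work has started.

```shell
# Pin the OpenMP/MKL thread pools before launching the app; these
# backends size their pools when the library initializes, so the
# variables must be set before the process starts.
export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=1
# ./inference_app   # placeholder name for the actual binary
```

Similarly, torch::set_num_interop_threads(1) has to be called before any inter-op parallel work runs, so it belongs at the very top of main(), before the model is loaded or any forward pass happens.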