For our use case we need to run the job on a single thread (preferably the main one). For all other workflows we just call torch.set_num_threads(1) and that gets the job done.
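For reference, this is all we do in the other workflows (plain torch API, nothing exotic):

```python
import torch

# Restrict PyTorch's intra-op thread pool to a single thread.
# Most CPU ops respect this; in our experience the QNNPACK
# quantized convolution path does not.
torch.set_num_threads(1)
print(torch.get_num_threads())  # 1
```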
But QNNPACK (more specifically the quantized convolution) doesn't respect this setting, since it uses its own internal thread pool, whose size is determined by a function that defaults to the number of cores. In our tests we get good results by patching that function to return 1.
Is there a way to set the num_threads of that pool without recompiling PyTorch? If not, what's the minimal acceptable refactoring to make it configurable, or to make it respect the global num_threads setting?