Today I wanted to test the CPU utilization of PyTorch matrix multiplication, so I ran the following benchmark:
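import timeit
import torch

runtimes = []
# Thread counts to test: 1 (assumed, the value was lost in the post), then 2, 4, ..., 48
threads = [1] + [t for t in range(2, 49, 2)]
for t in threads:
    torch.set_num_threads(t)
    # Time 100 multiplications of two 1024x1024 matrices
    r = timeit.timeit(
        setup="import torch; x = torch.randn(1024, 1024); y = torch.randn(1024, 1024)",
        stmt="torch.mm(x, y)",
        number=100,
    )
    runtimes.append(r)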
However, I found a strange problem: the CPU cores are very unevenly loaded. One core has a very high load while the others have a very low load, like this:
I compiled PyTorch version 1.4.1 with gcc/7.4.0 and set the environment variable USE_OPENMP=1. What could be causing this? Any help would be appreciated. Thanks!
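In case it helps narrow this down, here is a minimal sketch of the settings that could be checked first. It is not a definitive diagnosis. The environment variable names are the usual OpenMP/MKL ones, `threading_environment` and `torch_threading_info` are hypothetical helper names, and I am assuming `torch.__config__.parallel_info()` is available in this build (the code falls back gracefully if it is not).

```python
import os

# Environment variables that commonly cap or pin threads (standard OpenMP/MKL names).
THREAD_ENV_VARS = (
    "OMP_NUM_THREADS",
    "MKL_NUM_THREADS",
    "OMP_PROC_BIND",
    "GOMP_CPU_AFFINITY",
    "KMP_AFFINITY",
)


def threading_environment():
    """Collect process-level settings that can limit how many cores are used."""
    info = {name: os.environ.get(name) for name in THREAD_ENV_VARS}
    info["cpu_count"] = os.cpu_count()
    # sched_getaffinity only exists on some platforms (e.g. Linux).
    if hasattr(os, "sched_getaffinity"):
        info["allowed_cpus"] = len(os.sched_getaffinity(0))
    else:
        info["allowed_cpus"] = None
    return info


def torch_threading_info():
    """Return PyTorch's own view of its threading setup, or None if unavailable."""
    try:
        import torch
        # Assumption: parallel_info() exists in this build of PyTorch.
        return "threads={}\n{}".format(
            torch.get_num_threads(), torch.__config__.parallel_info()
        )
    except (ImportError, AttributeError):
        return None


if __name__ == "__main__":
    for key, value in threading_environment().items():
        print("{}: {}".format(key, value))
    print(torch_threading_info() or "PyTorch threading details unavailable")
```

If `allowed_cpus` is smaller than `cpu_count`, or one of the affinity variables is set, the process may be restricted to a subset of cores regardless of what `torch.set_num_threads` requests.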