CPU usage issue

Hi, guys,
When I run my model on the CPU, it uses all CPU cores by default. When I export OMP_NUM_THREADS=1, it takes almost the same time for the same input. So I wonder: why does using all CPU cores give no improvement over using a single thread?

I tried installing from both source and the binary, but it made no difference. The OS is CentOS Linux release 7.3.16.11.

What’s your PyTorch version? Also, did you try running e.g. with 4 OMP threads? I think the problem appears because when you’re using all the cores, they’re competing over cache space and can’t proceed as effectively.
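
To find the sweet spot, one option is to time the same run under a few different OMP_NUM_THREADS values. Below is a minimal sketch using only the Python standard library. The helper names (`time_with_threads`, `sweep`) and the script name `run_model.py` are hypothetical placeholders, not part of PyTorch; substitute whatever command runs your model.

```python
import os
import subprocess
import sys
import time


def time_with_threads(cmd, num_threads):
    """Run cmd in a child process with OMP_NUM_THREADS set; return wall-clock seconds."""
    # Copy the current environment and override only the thread count.
    env = dict(os.environ, OMP_NUM_THREADS=str(num_threads))
    start = time.perf_counter()
    subprocess.run(cmd, env=env, check=True)
    return time.perf_counter() - start


def sweep(cmd, thread_counts=(1, 2, 4, 8)):
    """Time cmd once per thread count and return {thread_count: seconds}."""
    return {n: time_with_threads(cmd, n) for n in thread_counts}


# Example usage (run_model.py is a hypothetical script that runs your model):
#   results = sweep([sys.executable, "run_model.py"])
#   for n, secs in results.items():
#       print(f"OMP_NUM_THREADS={n}: {secs:.2f}s")
```

Each run is a fresh process because OMP_NUM_THREADS is normally read when the OpenMP runtime starts up, so changing it inside an already running process may have no effect. A single timing per setting is noisy; repeating each run a few times gives a fairer comparison.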

My PyTorch version is 0.1.10. As you suggested, I set OMP_NUM_THREADS=4 and it works well. Thanks for your reply.