Multiprocessing CPU under-used, only 2 cores are used

I’m having a weird issue where only 2 or 3 CPU cores are used by torch.multiprocessing. It only seems to happen on our new machine with an i9-13900K CPU.
It works fine on our older Xeon CPUs (100% on 50 cores as expected, in the data preprocessing stage).
The code is the same on both machines.
I’m using PyTorch 2.0.1, and both machines are running Ubuntu 20.04.

Running the same code and parameters on the older Xeon machine:
it uses all 50 cores, as I specified.

There also seems to be a global cap on the number of cores used.

I observed that when I run one script with 50 workers, each worker uses around 4% CPU, whereas when I run two scripts at once, each with 50 workers, each worker uses around 1.9%.
The total CPU usage is 3 cores in both cases.
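
In case it helps narrow this down, here is a minimal diagnostic sketch (not taken from my actual script) for comparing the two machines. It prints the settings that, as far as I know, can limit how many cores a process is allowed to use: the CPU affinity mask, the OMP_NUM_THREADS and MKL_NUM_THREADS environment variables, and the PyTorch intra-op thread count. The function name describe_cpu_limits is made up for illustration, and the affinity check assumes Linux. It only reports values; it does not establish that any of them is the cause here.

```python
import os


def describe_cpu_limits():
    """Collect settings that can restrict how many cores this process uses.

    Hypothetical helper for illustration only; it reports values and
    changes nothing.
    """
    info = {
        # Number of logical CPUs the OS reports.
        "logical_cpus": os.cpu_count(),
        # Linux only: the CPUs this process is allowed to be scheduled on.
        "affinity": (
            sorted(os.sched_getaffinity(0))
            if hasattr(os, "sched_getaffinity")
            else None
        ),
        # Thread-count variables read by OpenMP / MKL backends (None if unset).
        "OMP_NUM_THREADS": os.environ.get("OMP_NUM_THREADS"),
        "MKL_NUM_THREADS": os.environ.get("MKL_NUM_THREADS"),
    }
    try:
        import torch

        # Number of threads PyTorch uses for intra-op parallelism on CPU.
        info["torch_num_threads"] = torch.get_num_threads()
    except ImportError:
        # Assumption: torch may be missing where this sketch is run.
        info["torch_num_threads"] = None
    return info


if __name__ == "__main__":
    for key, value in describe_cpu_limits().items():
        print(f"{key}: {value}")
```

Running this once in the parent process and once at the start of each worker, on both the i9-13900K and the Xeon machine, should show whether the workers see a smaller affinity set or different thread settings on the new machine.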