CPU thread use is crazy despite running model on GPU

I am running my model on the GPU. However data loading (loading only, not preprocessing) is happening on the CPU. I’m wrapping a dataset with the DataLoader class, specifying 4 workers.

When I run ps aux | grep <username> I see my 5 processes: the parent that is running the script and the 4 workers doing the data loading.

When I check the number of threads used by each process (ps -o nkwp <PID>), I see that each worker (child process) is using 2 threads (which makes sense as there are 2 threads per core). The parent process, on the other hand, is using 191 threads. My machine has 40 cores (should have 80 threads, I thought), so I don’t understand how the parent is using so many threads.

I don’t understand what is commanding 191 threads in the parent. My model is running entirely on the GPU, and the data loading is done in separate processes. Why am I using so many CPU resources?