num_workers causes insufficient shared memory

I am trying to train ImageNet on four 3090 graphics cards with a batch size of 4096. However, when I set num_workers larger than 8, this error shuts down my training program.

Did you try to increase your shared memory limit or check which limit is currently set?
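
If it helps, here is a small Python check you can run inside the training environment. It assumes shared memory is mounted at /dev/shm (the usual place the DataLoader workers exchange batches); adjust the path if your container mounts it elsewhere.

```python
import shutil

# DataLoader workers pass batches through shared memory (typically /dev/shm),
# so its total size and current usage are what matter here.
usage = shutil.disk_usage("/dev/shm")
print(f"total: {usage.total / 1024**3:.1f} GiB")
print(f"used:  {usage.used / 1024**3:.1f} GiB")
print(f"free:  {usage.free / 1024**3:.1f} GiB")
```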

I am working on a leased server. The shared memory of the virtual environment is 20 GB, and I can't change this setting. Is there any way to avoid this issue?

Assuming you are indeed using all 20 GB of it, I think the only workaround would be to reduce the shared memory usage by reducing the number of workers.
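
For reference, a minimal runnable sketch of that adjustment; the stand-in dataset and the batch size and worker count are placeholders chosen to mirror the numbers in this thread, not values the library requires.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for the real ImageNet dataset so the snippet is self-contained;
# swap in your actual dataset and a DistributedSampler for multi-GPU training.
train_dataset = TensorDataset(
    torch.randn(256, 3, 224, 224),
    torch.randint(0, 1000, (256,)),
)

loader = DataLoader(
    train_dataset,
    batch_size=64,     # per-GPU batch; in your setup this would be 4096 / 4 GPUs = 1024
    shuffle=True,
    num_workers=8,     # reduced from >8: fewer workers keep fewer batches in /dev/shm at once
    pin_memory=True,
)

for images, labels in loader:
    pass  # training step would go here
```

Each worker holds its prefetched batches in shared memory until the main process consumes them, so lowering num_workers (and, if needed, prefetch_factor) directly lowers the /dev/shm footprint.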