Is there a way to train faster on a single GPU? I have a 16-core CPU, and increasing the number of workers on the DataLoader seems to slow down training. Can I get any hints in this regard? I'm somewhat new to this domain. Thank you.
This post gives a great overview of how to handle data-loading bottlenecks.
In particular, this section might be interesting for you:
Beyond an optimal number (experiment!), throwing more worker processes at the IOPS barrier WILL NOT HELP; it'll make it worse. You'll have more processes trying to read files at the same time, and you'll be increasing shared-memory consumption by significant amounts for additional queuing, thus increasing the paging load on the system and possibly taking you into thrashing territory that the system may never recover from.