Can multiple training runs, all reading the same data on disk, slow each other down?

The torchvision CIFAR10 dataset loads the data into memory (as it's quite small), so I wouldn't expect to see a huge speedup from this large number of workers, and the runs shouldn't be contending much for disk I/O. Generally, 48 workers for each DataLoader sounds quite excessive: with multiple runs on the same machine, the worker processes can oversubscribe the CPU and slow each other down. I would recommend playing around with this value and checking whether you are creating the slowdown yourself. This post is also a very good reference when it comes to data loading bottlenecks.
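A minimal sketch for finding a reasonable `num_workers` value is to time one pass over the data for a few settings. Since CIFAR10 requires a download, this example uses a synthetic `TensorDataset` with CIFAR10-shaped samples as a stand-in; swap in `torchvision.datasets.CIFAR10` to benchmark the real dataset. The batch size and worker counts here are illustrative, not recommendations:

```python
import time

import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for CIFAR10 (3x32x32 images, 10 classes) so the
# sketch runs without downloading anything.
data = torch.randn(5000, 3, 32, 32)
targets = torch.randint(0, 10, (5000,))
dataset = TensorDataset(data, targets)

timings = {}
for num_workers in (0, 2, 4):
    loader = DataLoader(dataset, batch_size=256, num_workers=num_workers)
    start = time.perf_counter()
    for images, labels in loader:  # iterate once, discarding batches
        pass
    timings[num_workers] = time.perf_counter() - start
    print(f"num_workers={num_workers}: {timings[num_workers]:.2f}s")
```

If the timings stop improving (or get worse) past a small worker count, the extra processes are only adding overhead, which is exactly the effect to check for when several runs share one machine.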