Can multiple training runs, all reading the same data on disk, slow each other down?

The torchvision CIFAR10 dataset loads the data into memory (as it's quite small), so I wouldn't expect to see a huge speedup from this large number of workers, and the runs shouldn't be contending much for disk I/O. Generally, 48 workers for each DataLoader sounds quite excessive: with multiple runs on the same machine, the worker processes can oversubscribe the CPU and slow each other down. I would recommend playing around with this value and checking whether you are creating the slowdown yourself. This post is also a very good reference when it comes to data loading bottlenecks.
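A minimal sketch for finding a reasonable `num_workers` value is to time one pass over the data for a few settings. Since CIFAR10 requires a download, this example uses a synthetic `TensorDataset` with CIFAR10-shaped samples as a stand-in; swap in `torchvision.datasets.CIFAR10` to benchmark the real dataset. The batch size and worker counts here are illustrative, not recommendations:

```python
import time

import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for CIFAR10 (3x32x32 images, 10 classes) so the
# sketch runs without downloading anything.
data = torch.randn(5000, 3, 32, 32)
targets = torch.randint(0, 10, (5000,))
dataset = TensorDataset(data, targets)

timings = {}
for num_workers in (0, 2, 4):
    loader = DataLoader(dataset, batch_size=256, num_workers=num_workers)
    start = time.perf_counter()
    for images, labels in loader:  # iterate once, discarding batches
        pass
    timings[num_workers] = time.perf_counter() - start
    print(f"num_workers={num_workers}: {timings[num_workers]:.2f}s")
```

If the timings stop improving (or get worse) past a small worker count, the extra processes are only adding overhead, which is exactly the effect to check for when several runs share one machine.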