Multiple training runs on the same dataset at the same time?

Hi, I want to train my model with several different configurations. I don’t want to wait for training with one configuration to finish before starting the next one; I want to run all of them at the same time, and compute power is not a problem. However, I don’t want to create multiple copies of the same dataset — I’d like all trainings to share a single copy. Does the PyTorch DataLoader in one training run wait if the dataset is locked by another training run?

Typically the DataLoader only needs read-only access, so having multiple processes read the same dataset should work; there is no lock that would make one run wait for another. That said, if the bandwidth of the underlying storage becomes the bottleneck, you will see a slowdown. Caching and the storage characteristics can also affect how quickly the dataset is read, although standard DataLoader setups already read random samples from the dataset, so the access pattern itself is not unusual.
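As a minimal sketch of what this looks like: each configuration runs as its own process (e.g. a separate `python train.py` invocation), and each process builds its own Dataset and DataLoader over the same files, opening them read-only. The `CsvDataset` class, the `/shared/data/train.csv` path, and the `train.py` script are hypothetical placeholders, not part of torch.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class CsvDataset(Dataset):
    """Hypothetical dataset that only reads from a shared file."""
    def __init__(self, path):
        # Read-only access: nothing is written back to the file,
        # so several training processes can open it concurrently.
        with open(path, "r") as f:
            self.samples = [line.strip().split(",") for line in f]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        *features, label = self.samples[idx]
        x = torch.tensor([float(v) for v in features])
        y = torch.tensor(float(label))
        return x, y

if __name__ == "__main__":
    # Each configuration is launched as an independent process, e.g.
    #   python train.py --lr 0.1 &
    #   python train.py --lr 0.01 &
    # All of them point at the same file; no extra copy is needed.
    ds = CsvDataset("/shared/data/train.csv")      # hypothetical shared path
    loader = DataLoader(ds, batch_size=32, shuffle=True, num_workers=4)
    for x, y in loader:
        pass  # training step for this run's configuration
```

Whether the shared storage keeps up is then mostly a question of its read bandwidth, not of any locking inside the DataLoader.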

Best regards

Thomas
