Guidelines for assigning num_workers to DataLoader

apaszke · March 1, 2017, 9:55pm

Having more workers will increase the memory usage and that’s the most serious overhead. I’d just experiment and launch approximately as many as are needed to saturate the training. It depends on the batch size, but I wouldn’t set it to the same number - each worker loads a single batch and returns it only once it’s ready.

num_workers equal 0 means that it’s the main process that will do the data loading when needed, num_workers equal 1 is the same as any n, but you’ll only have a single worker, so it might be slow