Relation between num_workers, batch_size and epoch in DataLoader?

num_workers is unrelated to batch_size. Say you set batch_size to 20 and the training set has 2000 samples; each epoch then consists of 100 iterations, i.e. on each iteration the data loader returns a batch of 20 instances. Setting num_workers > 0 spawns worker processes that load and preprocess batches in the background, so the next batch is ready by the time the current one has been consumed. More workers consume more memory but help hide data-loading I/O latency. Please refer to this thread for more discussion on this problem.
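
Here is a minimal sketch illustrating the numbers above, assuming a toy TensorDataset of 2000 random samples (the dataset and tensor shapes are made up for illustration):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical dataset: 2000 samples of 10 features each, with binary labels.
dataset = TensorDataset(torch.randn(2000, 10), torch.randint(0, 2, (2000,)))

# batch_size=20 over 2000 samples -> 2000 / 20 = 100 iterations per epoch.
# num_workers=4 starts 4 worker processes that each prefetch whole batches
# in the background; it changes neither batch_size nor the iteration count.
loader = DataLoader(dataset, batch_size=20, num_workers=4, shuffle=True)

if __name__ == "__main__":  # guard required for multiprocessing workers on some platforms
    print(len(loader))  # 100 iterations per epoch

    for inputs, targets in loader:
        print(inputs.shape)  # torch.Size([20, 10]) -- one batch of 20 instances
        break
```

Whether you set num_workers to 0 or 4, len(loader) stays at 100; only how quickly each batch becomes available changes.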
