Num_workers in DataLoader

Hello everyone! I have a very large image dataset, so large that it takes about 10 seconds to create one batch.
To solve this problem I decided to use the num_workers parameter of torch.utils.data.DataLoader. For example:
trainset = datasets.ImageFolder(data_dir, data_transforms)
train_loader = torch.utils.data.DataLoader(trainset, batch_size=256, shuffle=True, num_workers=4)
But it only works for a small batch size, for example 32; for a batch size of 64 it does not work. With a batch size like 64 it only works with num_workers=0.
Can you help me and explain this phenomenon?

Maybe you are running out of memory because of your huge data.

What is the relationship between batch_size and num_workers?
Does num_workers have an impact on GPU memory?

Yes, it is possible, but I do not know how to fix that :frowning:

Sorry, I do not know. I thought that num_workers is the number of worker processes: these processes create your batches, and because of that they run on the CPU, in my opinion.
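
As a quick sketch of that idea (the tiny dataset below is something I made up just for illustration), you can print the worker id and the process id inside __getitem__ and see that the batches are assembled by separate CPU processes:

import os
import torch
from torch.utils.data import Dataset, DataLoader, get_worker_info

class ToyDataset(Dataset):
    def __len__(self):
        return 16

    def __getitem__(self, idx):
        info = get_worker_info()                 # None when loading runs in the main process
        wid = info.id if info is not None else "main"
        print(f"sample {idx} loaded by worker {wid} in process {os.getpid()}")
        return torch.tensor(idx)

if __name__ == "__main__":
    loader = DataLoader(ToyDataset(), batch_size=4, num_workers=2)
    for _ in loader:
        pass

With num_workers=2 you should see two different process ids doing the loading, and none of them is the main training process.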

Hi, have you fixed the num_workers problem? I have met the same problem. Can you share more details about your machine, your OS, etc.?

I have doubts about how to set num_workers in DataLoader and how it works. If I set a larger number, can I shorten the training time of my model, and can anybody explain how it speeds up training?

Have you solved the problem? I had the same problem.

I just put my comments on another thread: Guidelines for assigning num_workers to DataLoader
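
As a rough way to pick a value on your own machine (the in-memory dataset below is only a stand-in; swap in your real Dataset, e.g. the ImageFolder above), you can simply time one pass over the loader for a few num_workers settings and keep the fastest:

import time
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in dataset; speedups from workers mostly show up when __getitem__ is
# expensive (disk reads, JPEG decoding, heavy augmentation), not for tensors
# that already sit in memory.
dataset = TensorDataset(torch.randn(2000, 3, 32, 32),
                        torch.zeros(2000, dtype=torch.long))

if __name__ == "__main__":
    for workers in (0, 2, 4, 8):
        loader = DataLoader(dataset, batch_size=64, shuffle=True, num_workers=workers)
        start = time.time()
        for batch in loader:          # iterate only; no training step
            pass
        print(f"num_workers={workers}: {time.time() - start:.2f} s for one pass over the data")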

Hello, I'm a PyTorch beginner, but I want to share my case.

I supposed that each worker is assigned as many samples as the batch size in multi-worker loading jobs.

Let’s say,

  • N: total number of samples in dataloader
  • B: batch size
  • C: num_workers

N = B * C would be an appropriate balance when you choose the batch size and the number of workers.
If N < B * C, then some workers do not get any work.
If N > B * C, then all workers work (a small check of this is sketched below).
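
A minimal sketch to check which workers actually receive samples (the dataset class and the sizes below are made up for illustration): have __getitem__ return the worker id reported by torch.utils.data.get_worker_info() and count the ids over one epoch.

import collections
import torch
from torch.utils.data import Dataset, DataLoader, get_worker_info

class WorkerIdDataset(Dataset):
    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        info = get_worker_info()
        return info.id if info is not None else -1   # -1 means "loaded in the main process"

if __name__ == "__main__":
    N, B, C = 127, 20, 12
    loader = DataLoader(WorkerIdDataset(N), batch_size=B, num_workers=C)
    counts = collections.Counter()
    for batch in loader:              # each element of `batch` is the id of the worker that loaded it
        counts.update(batch.tolist())
    print(dict(counts))               # how many samples each worker handled in one epoch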

Here is my test case,
I tested with varying batch size and num_workers.
Total number of samples for the DataLoader = 127 (N)
Batch size (B), num_workers (C)

myDataset = classMyDataSet()  # classMyDataSet is my own subclass of torch.utils.data.Dataset
myLoader = torch.utils.data.DataLoader(myDataset, batch_size=B, shuffle=False, num_workers=C, pin_memory=True)
bx, by = next(iter(myLoader))  # at this line the DataLoader actually starts loading

Case 1. B = 1 and C = 0

  • Works well with one core
  • Only 1 sample was processed

Case 2. B = 1~10 and C = 1~12 (since my machine has 12 available threads)

  • Works well with C cores
  • Only B * C samples were processed (i.e., the N > B * C case)

Case 3. B = 11 and C = 12

  • Works well with C cores; however, one core processed only 7 samples
  • 127 samples (i.e., N) were processed (i.e., the N < B * C case)

Case 4. B = 20 and C = 12

  • Only 7 cores worked (6 cores processed 20 samples each, and 1 core processed 7 samples), while 5 cores did not work
  • 127 samples (i.e., N) were processed (i.e., the N < B * C case)

Case 5. B = 60 and C = 12

  • Only 3 cores worked (2 cores processed 60 samples each and 1 core processed 7), while 9 cores did not work
  • 127 samples (i.e., N) were processed (i.e., the N < B * C case)

Case 6. B = 127 and C = 12

  • Only one core worked and processed all 127 samples; the other 11 cores did not work
  • 127 samples (i.e., N) were processed (i.e., the N < B * C case)
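
One way I can reconcile the busy-core counts above (this is just my reading of the behavior, not something taken from the documentation) is that the loader splits the dataset into ceil(N / B) batches and hands them to workers round-robin, so at most min(C, ceil(N / B)) workers get anything to do:

import math

N = 127
for B, C in [(11, 12), (20, 12), (60, 12), (127, 12)]:
    n_batches = math.ceil(N / B)          # batches in one epoch (drop_last=False)
    busy = min(C, n_batches)              # workers that receive at least one batch
    print(f"B={B:3d}: {n_batches:2d} batches -> {busy:2d} busy workers")

With N = 127 this gives 12, 7, 3, and 1 busy workers for B = 11, 20, 60, and 127, which matches the counts I observed.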

I have a question about this issue.
Intuitively, I expected the workers to process only as many samples as the batch size.
However, B * C samples were processed when I called the batch loader (i.e., at the line bx, by = next(iter(myLoader))).
Is there any reason for this?

Thanks to PyTorch developers and contributors!

Hello! I don't understand the meaning of "total number of samples in dataloader". Is that the number of samples in the dataset we load with the DataLoader? Could you give an example?

It seems like PyTorch sets up C worker processes and each of them reads a batch of B samples ahead of time, so C * B samples end up being loaded for the first iteration, even though each iteration of the loader still yields a batch of B samples.
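
If it helps, recent PyTorch versions expose this read-ahead explicitly through the prefetch_factor argument (reusing the names from the post above; this is just an illustration, not a fix):

# Each worker keeps prefetch_factor batches in flight, so roughly
# num_workers * prefetch_factor * batch_size samples are loaded ahead of time,
# while each iteration of the loader still yields exactly batch_size samples.
loader = torch.utils.data.DataLoader(
    myDataset, batch_size=B, shuffle=False,
    num_workers=C, prefetch_factor=2)     # 2 is the default when num_workers > 0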