In DataLoader, do the workers specified by num_workers jointly load the samples of one batch, or does each worker load a whole batch on its own?

yangwenhuan · August 21, 2019, 6:40am

I mean, suppose batch_size=4 and num_workers=2. Which of the following matches what happens at runtime?

#1
worker1: loads sample1, sample2
worker2: loads sample3, sample4
batch1 contains sample1, sample2, sample3, sample4

#2
worker1: loads batch1
worker2: loads batch2

alex.veuthey · August 21, 2019, 7:10am

Apparently, it would be case #2. See this answer.

yangwenhuan · August 21, 2019, 8:09am

I have read the mentioned post, but how can I prove it? Is there any PyTorch source code that shows this?

alex.veuthey · August 21, 2019, 8:45am

The source code for the DataLoader logic is too complex for me, but I see two reasons why #2 would be the approach used in practice:

1. @apaszke said so.
2. Creating a batch from the two workers would require waiting for both of them to finish their task (getting data) before returning the batch, which is inefficient and (I guess depending on the architecture, data structure, storage etc.) pointless compared to a single process loading sample1, sample2, sample3 and sample4 and then returning the batch.
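One way to check this empirically, rather than reading the DataLoader source, is to have each sample report which worker loaded it. The following is a minimal sketch, not taken from the PyTorch code base: `TraceDataset` and `trace_batches` are hypothetical names made up for illustration, and the only PyTorch calls used are `DataLoader`, `Dataset` and `torch.utils.data.get_worker_info()`. It assumes a map-style dataset, the default sampler (no shuffling) and the default collate function.

```python
from torch.utils.data import DataLoader, Dataset, get_worker_info


class TraceDataset(Dataset):
    """Hypothetical map-style dataset: each item records which worker loaded it."""

    def __init__(self, num_samples):
        self.num_samples = num_samples

    def __len__(self):
        return self.num_samples

    def __getitem__(self, index):
        info = get_worker_info()  # None when called in the main process
        worker_id = info.id if info is not None else -1
        return index, worker_id


def trace_batches(num_samples=8, batch_size=4, num_workers=2):
    """Return, for every batch, its sample indices and the worker ids that loaded them."""
    loader = DataLoader(
        TraceDataset(num_samples),
        batch_size=batch_size,
        num_workers=num_workers,
    )
    trace = []
    # The default collate function turns a batch of (index, worker_id) pairs
    # into two tensors: one of indices and one of worker ids.
    for indices, worker_ids in loader:
        trace.append((indices.tolist(), sorted(set(worker_ids.tolist()))))
    return trace


if __name__ == "__main__":
    for indices, workers in trace_batches():
        print(f"samples {indices} loaded by worker(s) {workers}")
```

If case #2 holds, every batch should report exactly one worker id; under case #1 a single batch would list several.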

What is your use case?

yangwenhuan · August 21, 2019, 8:56am

For your first point, @apaszke only showed the result, but I want to know the reason.
For the second, multiple workers handling one batch may not be slower than a single worker.

Howl · February 9, 2021, 7:27am

In my test case, I use 200 samples and set the batch size to 50. The result shows that only 4 subprocesses do any work, even though I set num_workers=16 (I have 16 CPUs). So I think alex.veuthey is right.
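The arithmetic behind this observation is that 200 samples with a batch size of 50 give only ceil(200 / 50) = 4 batches, so if each batch is loaded by a single worker, at most 4 workers can ever receive work. Below is a hedged sketch of such a test, not the code I originally ran: `WorkerIdDataset` and `active_workers` are hypothetical names chosen for illustration, and it assumes a map-style dataset with the default sampler and collate function.

```python
import math

from torch.utils.data import DataLoader, Dataset, get_worker_info


class WorkerIdDataset(Dataset):
    """Hypothetical dataset whose items are the id of the worker that loaded them."""

    def __init__(self, num_samples):
        self.num_samples = num_samples

    def __len__(self):
        return self.num_samples

    def __getitem__(self, index):
        info = get_worker_info()  # None when called in the main process
        return info.id if info is not None else -1


def active_workers(num_samples, batch_size, num_workers):
    """Return the set of worker ids that loaded at least one sample."""
    loader = DataLoader(
        WorkerIdDataset(num_samples),
        batch_size=batch_size,
        num_workers=num_workers,
    )
    used = set()
    for worker_ids in loader:  # one tensor of worker ids per batch
        used.update(worker_ids.tolist())
    return used


if __name__ == "__main__":
    used = active_workers(num_samples=200, batch_size=50, num_workers=16)
    num_batches = math.ceil(200 / 50)
    print(f"{len(used)} of 16 workers loaded data ({num_batches} batches in total)")
```

The number of active workers should never exceed the number of batches, however large num_workers is.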