Do more GPUs speed up the DataLoader?

In my experiments, I tested different numbers of GPUs and different num_workers values in the DataLoader. Here are the results:

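| num GPUs | batch size | num_workers | time per batch (get data + forward/backward) | time per batch (get data) |
|---|---|---|---|---|
| 2 | 512 | 128 | 0.8 | 0.3 |
| 2 | 512 | 32 | 0.8 | 0.3 |
| 2 | 512 | 16 | 0.8 | 0.3 |
| 2 | 512 | 8 | 0.8 | 0.3 |
| 2 | 512 | 2 | 0.8 | 0.3 |
| 2 | 512 | 1 | 3.7 | 3.2 |
| 8 | 512 | 32 | 0.3 | 0.08 |
| 8 | 1024 | 32 | 0.55 | 0.14 |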

There are two weird things: 1) using more GPUs seems to speed up the DataLoader, and 2) increasing num_workers (beyond 2) doesn't speed up getting the data.

Interesting. Were these experiments run with pin_memory=True in the DataLoaders?
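
For reference, here is a minimal sketch of the setup I'm asking about. The dataset is a hypothetical in-memory stand-in, and `make_loader` and `first_batch_on_device` are made-up helper names, not taken from your code. `pin_memory=True` makes the loader return batches in page-locked host memory, and `non_blocking=True` on the copy only helps when the source tensor is pinned.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def make_loader(num_workers: int, pin_memory: bool, batch_size: int = 512) -> DataLoader:
    # Hypothetical stand-in dataset: 4096 random feature vectors with integer labels.
    features = torch.randn(4096, 16)
    labels = torch.randint(0, 10, (4096,))
    return DataLoader(
        TensorDataset(features, labels),
        batch_size=batch_size,
        shuffle=True,
        num_workers=num_workers,
        pin_memory=pin_memory,  # return batches in page-locked host memory
    )


def first_batch_on_device(loader: DataLoader, device: torch.device):
    # Fetch one batch and move it to the target device.
    x, y = next(iter(loader))
    # non_blocking=True can overlap the copy with compute only if x and y are pinned.
    return x.to(device, non_blocking=True), y.to(device, non_blocking=True)
```

You could compare your timings with `pin_memory=True` and `pin_memory=False` while keeping everything else fixed.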

Could you share the code you’ve used to measure the data loading speed?
Note that too many workers might degrade the performance as explained in this post.
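
In case it helps the comparison, here is a rough sketch of how I would separate the two timings. It is an assumption about the measurement, not your actual code: `time_batches`, `step`, and `sync` are hypothetical names. The "data" time is how long the training loop blocks waiting for the next batch, and the "total" time also includes the work done in `step`.

```python
import time
from typing import Callable, Iterable, Optional, Tuple


def time_batches(
    loader: Iterable,
    step: Callable[[object], None],
    sync: Optional[Callable[[], None]] = None,
    max_batches: int = 50,
) -> Tuple[float, float]:
    """Return (mean data wait, mean total time) per batch, in seconds."""
    data_time = 0.0
    total_time = 0.0
    n = 0
    end = time.perf_counter()
    for batch in loader:
        fetched = time.perf_counter()
        data_time += fetched - end  # time the loop was blocked waiting for data
        step(batch)                 # e.g. forward + backward + optimizer step
        if sync is not None:
            sync()                  # wait for queued asynchronous work to finish
        now = time.perf_counter()
        total_time += now - end     # data wait + step for this batch
        end = now
        n += 1
        if n >= max_batches:
            break
    if n == 0:
        raise ValueError("loader produced no batches")
    return data_time / n, total_time / n
```

On GPU, CUDA kernels are launched asynchronously, so I would pass `torch.cuda.synchronize` as `sync`; otherwise part of the compute time can show up in the next batch's "data" time. The first batch also includes worker startup, so it may be worth skipping a few warm-up iterations before averaging.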