Why is my DistributedDataParallel slower than DataParallel if my Dataset is not fully loaded in memory?

Hey @ammary-mo, how did you measure the delay? DataParallel and DistributedDataParallel are only involved in the forward and backward passes, so could you please try using elapsed_time to break the delay down into data loading, forward, and backward? See the following discussion. If multiple DDP processes read from the same file, contention could cause a data loading perf regression. If that's the case, the solution would be to implement a more performant data loader.
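Here is a rough sketch of that kind of breakdown, assuming elapsed_time refers to `torch.cuda.Event.elapsed_time`. `PhaseTimer` and `profile` are hypothetical helpers, not part of PyTorch, and the loop assumes each batch is an `(inputs, targets)` pair of tensors. Data loading is host-side work, so it is timed with a wall clock; forward and backward are timed with CUDA events when a GPU is available.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

try:
    import torch
    USE_CUDA = torch.cuda.is_available()
except ImportError:  # lets the timer itself run without PyTorch installed
    torch = None
    USE_CUDA = False


class PhaseTimer:
    """Hypothetical helper: accumulates time per phase, in milliseconds."""

    def __init__(self):
        self.totals_ms = defaultdict(float)

    @contextmanager
    def phase(self, name, gpu=False):
        if gpu and USE_CUDA:
            # CUDA kernels run asynchronously, so time them with events.
            start = torch.cuda.Event(enable_timing=True)
            end = torch.cuda.Event(enable_timing=True)
            start.record()
            yield
            end.record()
            end.synchronize()  # wait until the GPU has reached `end`
            self.totals_ms[name] += start.elapsed_time(end)  # milliseconds
        else:
            # Host-side work (or no GPU): plain wall-clock time.
            t0 = time.perf_counter()
            yield
            self.totals_ms[name] += (time.perf_counter() - t0) * 1000.0


def profile(loader, model, loss_fn, optimizer, timer, device):
    """Run one pass over `loader`, timing each phase separately."""
    it = iter(loader)
    while True:
        with timer.phase("data"):
            batch = next(it, None)  # time spent waiting on the DataLoader
        if batch is None:
            break
        inputs, targets = (t.to(device) for t in batch)
        with timer.phase("forward", gpu=True):
            loss = loss_fn(model(inputs), targets)
        with timer.phase("backward", gpu=True):
            optimizer.zero_grad()
            loss.backward()
        optimizer.step()
```

Running `profile` once with the DataParallel model and once per process with the DDP model, then comparing `timer.totals_ms`, should show whether the gap comes from the "data" phase. Keep in mind that with DDP the "backward" number also includes gradient synchronization across processes.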

cc @VitalyFedyunin @glaringlee for DataLoader and DataSampler.
