What happens when we do not give a distributed sampler? Does it essentially iterate over all samples with as many ranks as we have?
If the data is not sharded across the DDP ranks (e.g. with a `DistributedSampler`, or some custom sharding logic of your own), then yes, every rank iterates over all samples (in your example I guess there are 2 ranks).
This is why in general you want to partition your data appropriately across ranks, so that different model replicas see different data.
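As a quick sketch of what the sampler does: `DistributedSampler` normally reads the rank and world size from the initialized process group, but you can pass `num_replicas` and `rank` explicitly to see the sharding without spawning processes. (The dataset and the 2-rank setup here are just illustrative.)

```python
import torch
from torch.utils.data import TensorDataset, DataLoader
from torch.utils.data.distributed import DistributedSampler

# Toy dataset of 8 samples.
dataset = TensorDataset(torch.arange(8))

# One sampler per rank; with shuffle=False the shards are easy to inspect.
for rank in range(2):
    sampler = DistributedSampler(dataset, num_replicas=2, rank=rank, shuffle=False)
    loader = DataLoader(dataset, sampler=sampler, batch_size=2)
    print(f"rank {rank} sees indices: {list(sampler)}")
```

Each rank gets a disjoint slice of the indices, so together the ranks cover the dataset exactly once per epoch. In real DDP training you would also call `sampler.set_epoch(epoch)` at the start of each epoch so the shuffling differs between epochs.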
Thanks Rohan, this cleared it up for me.