Why is 'sampler.set_epoch(epoch)' needed for DistributedSampler?

I’ve seen various examples using DistributedDataParallel: some create a DistributedSampler and also call sampler.set_epoch(epoch) at the start of every epoch in the training loop, while others skip the call entirely. Why the difference, and is set_epoch actually required for distributed training to run correctly?
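The pattern I mean looks roughly like this (a minimal sketch; world_size and rank are placeholder values that torchrun / init_process_group would normally provide, and the dataset is a toy stand-in):

```python
import torch
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

# Placeholders for values the process group would normally supply.
world_size, rank = 2, 0

dataset = TensorDataset(torch.arange(16).float())
sampler = DistributedSampler(dataset, num_replicas=world_size, rank=rank, shuffle=True)
loader = DataLoader(dataset, batch_size=4, sampler=sampler)

for epoch in range(3):
    sampler.set_epoch(epoch)  # the call in question
    for (batch,) in loader:
        pass  # forward / backward / optimizer step in real training
```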

Based on the docs, set_epoch is necessary to guarantee a different shuffling order each epoch:

In distributed mode, calling the set_epoch() method at the beginning of each epoch before creating the DataLoader iterator is necessary to make shuffling work properly across multiple epochs. Otherwise, the same ordering will be always used.
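The behavior the docs describe is easy to check directly. DistributedSampler seeds its shuffle with (seed + epoch), and the internal epoch only changes when set_epoch is called, so without it every pass over the sampler produces the identical order. A small demo (passing num_replicas and rank explicitly so no process group is needed):

```python
import torch
from torch.utils.data import DistributedSampler, TensorDataset

dataset = TensorDataset(torch.arange(8))
# Explicit num_replicas/rank lets this run without init_process_group.
sampler = DistributedSampler(dataset, num_replicas=2, rank=0, shuffle=True)

# Without set_epoch the internal epoch stays 0, so the shuffle is
# reseeded identically and every pass yields the same index order.
first_pass = list(sampler)
second_pass = list(sampler)
assert first_pass == second_pass

# With set_epoch the epoch is folded into the seed, so each epoch
# gets a fresh shuffle (the two lists below will typically differ).
sampler.set_epoch(0)
epoch0 = list(sampler)
sampler.set_epoch(1)
epoch1 = list(sampler)
```

So skipping set_epoch doesn't break training, but each epoch then sees the data in the same shuffled order, which loses some of the regularization benefit of reshuffling.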