Does DataLoader prefetch batches for the next epoch once the current one has been consumed?

When I have many small epochs, can the DataLoader prefetch the first batches of the next epoch? I would assume this could be done with persistent_workers=True.

Yes, with persistent_workers=True the workers are not shut down at the end of an epoch and are not restarted for the next one, so the batches will be loaded continuously.

I guess there is a problem with that: we often call sampler.set_epoch(epoch) before the new epoch, especially in a distributed context, so any prefetched batches would need to be discarded somehow and the sampler would need to be re-evaluated.
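For context, here is a minimal sketch of why set_epoch matters. It assumes a toy 8-element dataset and passes num_replicas and rank explicitly so that no process group has to be initialized; the helper name epoch_order is made up for illustration. With shuffle=True, DistributedSampler derives its permutation from seed + epoch, so the index order is only recomputed when the sampler is iterated again after set_epoch.

```python
import torch
from torch.utils.data import TensorDataset
from torch.utils.data.distributed import DistributedSampler

# Toy dataset (assumption): 8 items whose values equal their indices.
dataset = TensorDataset(torch.arange(8))

# Passing num_replicas and rank explicitly avoids needing an initialized
# process group; this pretends to be rank 0 of 2 replicas.
sampler = DistributedSampler(dataset, num_replicas=2, rank=0, shuffle=True, seed=0)


def epoch_order(epoch):
    """Hypothetical helper: indices this rank would see in the given epoch."""
    sampler.set_epoch(epoch)
    return list(sampler)


order0 = epoch_order(0)
order1 = epoch_order(1)
print("epoch 0:", order0)
print("epoch 1:", order1)
```

Any batches built from the epoch 0 order before set_epoch(1) is called would follow the old permutation, which is the concern raised here.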

That’s a good point and I guess I’m wrong.
Based on this code, the reset method is called, which seems to grab the new sampler.
You could test the behavior of the sampler with a small, pre-defined code snippet and check whether persistent workers behave the same as non-persistent ones.
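A minimal sketch of such a test, under a few assumptions: EpochShiftSampler is a made-up sampler whose order is rotated by the epoch number (so the epoch is easy to read off the output), the dataset is a toy TensorDataset, and collect_epochs is a hypothetical helper. If the sampler is re-read when a new iterator is requested, each epoch's order should reflect the value passed to set_epoch, with or without persistent workers.

```python
import torch
from torch.utils.data import DataLoader, Sampler, TensorDataset


class EpochShiftSampler(Sampler):
    """Hypothetical sampler: yields 0..n-1 rotated left by the current epoch."""

    def __init__(self, n):
        self.n = n
        self.epoch = 0

    def set_epoch(self, epoch):
        self.epoch = epoch

    def __iter__(self):
        return iter([(i + self.epoch) % self.n for i in range(self.n)])

    def __len__(self):
        return self.n


def collect_epochs(num_epochs=3, n=8, batch_size=2, persistent=True):
    """Return the item order observed in each epoch."""
    dataset = TensorDataset(torch.arange(n))  # item i has value i
    sampler = EpochShiftSampler(n)
    loader = DataLoader(
        dataset,
        batch_size=batch_size,
        sampler=sampler,
        num_workers=2,
        persistent_workers=persistent,
    )
    seen = []
    for epoch in range(num_epochs):
        sampler.set_epoch(epoch)  # called before a new iterator is requested
        seen.append([int(x) for (batch,) in loader for x in batch])
    return seen


if __name__ == "__main__":
    for persistent in (True, False):
        print("persistent_workers =", persistent)
        for epoch, order in enumerate(collect_epochs(persistent=persistent)):
            print("  epoch", epoch, order)
```

Comparing the two runs shows whether persistent workers pick up the updated sampler state or serve stale, prefetched indices.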