I do not understand how to use DataPipes in multi-GPU training. Specifically, I have the following questions:
- Where should shuffling happen? In the datapipe or in the DataLoader?
- Is it correct to use DistributedSampler with a DataLoader + datapipe?
- Are there any examples of DDP + datapipe usage?
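For reference, here is a minimal sketch of the kind of setup I am asking about (single process, no DDP; `IterableWrapper(range(10))` is just placeholder data). I put `shuffle()` before `sharding_filter()` inside the pipe, but I am not sure this is the intended pattern:

```python
# Sketch of a datapipe + DataLoader setup (placeholder data, world size 1).
from torch.utils.data import DataLoader
from torch.utils.data.datapipes.iter import IterableWrapper

# Shuffle before sharding so each epoch is reshuffled, then shard so
# each rank/worker would see a disjoint slice of the data under DDP.
datapipe = IterableWrapper(range(10)).shuffle().sharding_filter()

# shuffle=True tells the DataLoader to enable the pipe's shuffle op.
loader = DataLoader(datapipe, batch_size=None, shuffle=True)

samples = list(loader)
print(sorted(samples))  # all 10 items, each seen exactly once
```

With a single process this yields every item once, but it is unclear to me how this interacts with DistributedSampler, which is why I am asking.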