How do I place my data when using DDP?

I want to use DDP across multiple machines, and I know that "torch.utils.data.distributed.DistributedSampler" can load the dataset in random order. My question is: do I need to place the same full dataset on every machine? Or can I assign a specific part of the dataset to each machine to relieve storage pressure?

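DistributedSampler should work well with DDP, and I believe split storage should be fine. One thing to keep in mind: by default DistributedSampler partitions the indices over the global world size and assumes every process can read any index, so with split storage each machine would build its dataset from its local part only and shard within that. Have you tried that?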
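For what it's worth, here is a minimal sketch of what the split-storage setup could look like, not a confirmed recommendation. It assumes each machine holds its own shard, that all shards are the same size (so every rank runs the same number of steps), and that the sampler is told to split only among the processes on the local machine by passing num_replicas and rank explicitly. The helper name build_local_loader and the toy TensorDataset are hypothetical, made up for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler


def build_local_loader(local_shard, local_rank, local_world_size, batch_size, seed=0):
    """Hypothetical helper: split this machine's shard among its own processes only."""
    sampler = DistributedSampler(
        local_shard,
        num_replicas=local_world_size,  # processes on THIS machine, not the global world size
        rank=local_rank,                # rank within this machine
        shuffle=True,
        seed=seed,                      # must match across the processes on one machine
        drop_last=True,                 # keep per-process sample counts equal
    )
    loader = DataLoader(local_shard, batch_size=batch_size, sampler=sampler, drop_last=True)
    return loader, sampler


# Toy stand-in for the part of the dataset stored on one machine (assumption).
local_shard = TensorDataset(torch.arange(100))

# Simulate the two processes of one machine. In a real job launched with torchrun,
# each process would instead read int(os.environ["LOCAL_RANK"]) and
# int(os.environ["LOCAL_WORLD_SIZE"]).
loader0, sampler0 = build_local_loader(local_shard, local_rank=0, local_world_size=2, batch_size=10)
loader1, sampler1 = build_local_loader(local_shard, local_rank=1, local_world_size=2, batch_size=10)

# Call set_epoch at the start of every epoch so the shuffle order changes per epoch.
sampler0.set_epoch(0)
sampler1.set_epoch(0)
indices0 = set(sampler0)
indices1 = set(sampler1)
```

With this arrangement the two local processes see disjoint halves of the local shard, and no process ever asks for a sample stored on another machine. If the shards differ in size, the ranks would run different numbers of batches and DDP could hang at the gradient all-reduce, so you would have to equalize them first.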

cc Dataloader folks to confirm: @VitalyFedyunin