Sharing a dataset between subprocesses when using DistributedDataParallel

Can you use the PyTorch DataLoader? If your Dataset implements the __getitem__ function, samples are only read into memory when a batch is requested. Each DDP replica then has its own DataLoader, and since each DataLoader loads data lazily, there shouldn’t be as much memory pressure.
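
A minimal sketch of the idea, assuming each sample lives in its own file (the file list, paths, and batch/worker sizes here are hypothetical placeholders):

```python
import torch
from torch.utils.data import Dataset, DataLoader
from torch.utils.data.distributed import DistributedSampler


class LazyFileDataset(Dataset):
    """Map-style dataset that keeps only file paths in memory and
    loads each sample from disk on demand in __getitem__."""

    def __init__(self, sample_paths):
        # Only the list of paths is held in memory, not the data itself.
        self.sample_paths = sample_paths

    def __len__(self):
        return len(self.sample_paths)

    def __getitem__(self, idx):
        # The tensor is read from disk only when this index is requested
        # by a DataLoader worker.
        return torch.load(self.sample_paths[idx])


# Inside each DDP process (one process per replica), something like:
#
#   dataset = LazyFileDataset(sample_paths)   # hypothetical list of .pt files
#   sampler = DistributedSampler(dataset)     # shards indices across replicas
#   loader = DataLoader(dataset, batch_size=32, sampler=sampler, num_workers=4)
#
#   for epoch in range(num_epochs):
#       sampler.set_epoch(epoch)              # reshuffle the shards each epoch
#       for batch in loader:
#           ...
```

With this setup, each replica only ever materializes the batches it is currently working on, rather than a full copy of the dataset.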

Relevant Forums Post: How to use dataset larger than memory?