Data loaded multiple times with DistributedDataParallel

I have a fairly simple training script that

  1. Reads data from parquet into a pandas DataFrame
  2. Pushes data into a torch tensor
  3. Uses TensorDataset/DistributedSampler/DataLoader to load data during training
  4. Uses DistributedDataParallel to manage distributed training across the GPUs of a single instance (rough sketch of the whole pipeline below).
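
For reference, here is roughly what the relevant part of train looks like. It's a simplified sketch: the parquet path, column names, the build_model helper, and the args fields are placeholders rather than my exact code.

```python
import pandas as pd
import torch
from torch.utils.data import TensorDataset, DataLoader
from torch.utils.data.distributed import DistributedSampler
from torch.nn.parallel import DistributedDataParallel as DDP

def train(gpu, args):
    # (process-group setup omitted) -- each spawned process runs all of this

    # 1. Read parquet into a pandas DataFrame
    #    ("features.parquet" and the column names are placeholders)
    df = pd.read_parquet("features.parquet")

    # 2. Push the data into torch tensors
    features = torch.tensor(df.drop(columns=["label"]).values, dtype=torch.float32)
    labels = torch.tensor(df["label"].values, dtype=torch.float32)

    # 3. TensorDataset + DistributedSampler + DataLoader
    dataset = TensorDataset(features, labels)
    sampler = DistributedSampler(dataset, num_replicas=args.world_size, rank=gpu)
    loader = DataLoader(dataset, batch_size=args.batch_size, sampler=sampler)

    # 4. Wrap the model with DistributedDataParallel on this GPU
    model = build_model().to(gpu)       # build_model() is a placeholder
    model = DDP(model, device_ids=[gpu])

    for epoch in range(args.epochs):
        sampler.set_epoch(epoch)
        for x, y in loader:
            ...  # forward / backward / optimizer step
```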

However, I know that when I call mp.spawn(train, nprocs=args.gpus, args=(args,)), the code that reads my feature and label data is executed in each process. I'm sure this causes some unnecessary memory/CPU overhead on the machine. Is there any obvious way to avoid this?
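
For context, the entry point looks roughly like this (again simplified, argument names are placeholders), so every spawned process re-runs the data-reading code at the top of train:

```python
import argparse
import torch
import torch.multiprocessing as mp

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--gpus", type=int, default=torch.cuda.device_count())
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--batch_size", type=int, default=256)
    args = parser.parse_args()
    args.world_size = args.gpus  # single machine, one process per GPU

    # Each of the args.gpus processes executes train() (as sketched above)
    # from the top, including the pd.read_parquet / tensor-construction code.
    mp.spawn(train, nprocs=args.gpus, args=(args,))
```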

Thanks so much!
-Sohrab Andaz