Hi,

Imagine you have enough RAM to load all of the images and targets of your dataset into lists during the `__init__` of your `Dataset` class. When training a model with `torchrun`, each GPU process creates its own `Dataset`, which leads to `N * dataset_size` of RAM being used.
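For reference, the setup looks roughly like this (a simplified sketch rather than my actual COCO-based class; `load_image` and `build_target` stand in for my real loading code):

```python
from torch.utils.data import Dataset

class PreloadedDataset(Dataset):
    def __init__(self, image_paths, annotations):
        # Everything is decoded up front and kept in plain Python lists,
        # so every torchrun process ends up holding its own full copy.
        self.images = [load_image(p) for p in image_paths]      # placeholder
        self.targets = [build_target(a) for a in annotations]   # placeholder

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        return self.images[idx], self.targets[idx]
```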
Is it possible to share the `images` and `targets` lists across all processes without replication, so that the total memory consumption stays at `dataset_size`?
FYI, my `Dataset` inherits from COCO.
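To clarify what I mean by "sharing without replication", something along the lines of the sketch below is what I'm imagining: rank 0 puts the data into a named shared-memory block and the other ranks just attach to it. This is only a rough sketch; it assumes fixed-size images, that `torch.distributed` is already initialised by `torchrun`, and the block name and sizes are made up.

```python
import numpy as np
import torch.distributed as dist
from multiprocessing import shared_memory

SHM_NAME = "coco_images"          # arbitrary name, shared by all ranks
N, H, W = 5000, 480, 640          # made-up dataset / image sizes
SHAPE = (N, H, W, 3)
NBYTES = int(np.prod(SHAPE))      # uint8 -> one byte per element

if dist.get_rank() == 0:
    # Rank 0 allocates the shared block and fills it once.
    shm = shared_memory.SharedMemory(name=SHM_NAME, create=True, size=NBYTES)
    images = np.ndarray(SHAPE, dtype=np.uint8, buffer=shm.buf)
    # ... decode the dataset here and write each image into images[i] ...
    dist.barrier()                # only release the other ranks once it is filled
else:
    dist.barrier()
    # Other ranks attach to the existing block: no extra copy of the data.
    shm = shared_memory.SharedMemory(name=SHM_NAME)
    images = np.ndarray(SHAPE, dtype=np.uint8, buffer=shm.buf)
```

The `targets` list would presumably need a similar treatment (e.g. a pickled buffer), since it is not a fixed-shape array. Is something like this the way to go, or is there a more standard mechanism?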