Suppose I am using a fairly large Dataset that I want to load only once. Using multiprocessing.Pool I spawn several processes that train different models with different hyperparameters, so this is not a distributed learning problem but rather a parallelization across models. What I want is for each subprocess to have access to the dataset that was loaded once, and to build its own DataLoader from it to receive batches.
How would one best do this in PyTorch? I can't figure out a way to make the Dataset accessible to all subprocesses (other than constructing it separately in every subprocess, which is costly).
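For context, here is a minimal sketch of the pattern I have in mind, stripped of the actual PyTorch training code. It relies on the POSIX `"fork"` start method, so children inherit the module-level dataset copy-on-write instead of reloading it; `load_dataset` and `train_model` are hypothetical stand-ins for my expensive load and my per-hyperparameter training routine.

```python
# Sketch: load the dataset once in the parent, then let forked workers
# inherit it copy-on-write. In the real code, each worker would wrap
# DATASET in its own torch.utils.data.DataLoader.
import multiprocessing as mp

DATASET = None  # module-level so forked children inherit it


def load_dataset():
    # Stand-in for an expensive one-time load (e.g. reading a big file).
    return list(range(10_000))


def train_model(lr):
    # Each worker sees the parent's DATASET without reloading it.
    # Here a DataLoader would be built per process, e.g.
    # loader = DataLoader(DATASET, batch_size=64, shuffle=True)
    return (lr, len(DATASET))


def run_experiments(lrs):
    global DATASET
    DATASET = load_dataset()  # loaded exactly once, in the parent
    # "fork" (Linux/macOS) shares the parent's memory copy-on-write;
    # "spawn" would re-import the module and lose the loaded DATASET.
    ctx = mp.get_context("fork")
    with ctx.Pool(processes=len(lrs)) as pool:
        return pool.map(train_model, lrs)
```

Is something like this the recommended approach, or is there a cleaner PyTorch-native way (shared-memory tensors, for instance)?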
Thanks!