I think you could try using a BatchSampler and pass the indices of the entire batch to Dataset.__getitem__, as described here. Inside __getitem__ you could then try to use multiple processes to load the samples in parallel.
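
Something like this rough sketch might work as a starting point. It assumes the BatchSampler is passed as `sampler` with `batch_size=None`, so automatic batching is disabled and `__getitem__` receives the whole list of indices. `BatchedDataset` and `_load_sample` are made-up names for illustration, and the "loading" is just a placeholder. Note that I used a thread pool here rather than multiple processes to keep the example simple; if your loading is CPU-bound, you could try swapping in a process pool instead.

```python
from concurrent.futures import ThreadPoolExecutor

import torch
from torch.utils.data import BatchSampler, DataLoader, Dataset, SequentialSampler


class BatchedDataset(Dataset):
    """Toy dataset (hypothetical) whose __getitem__ receives a whole batch of indices."""

    def __init__(self, size, workers=4):
        self.size = size
        self.workers = workers

    def __len__(self):
        return self.size

    def _load_sample(self, index):
        # Placeholder for the expensive per-sample load (e.g. reading a file).
        return torch.tensor([float(index)])

    def __getitem__(self, indices):
        # `indices` is a list because the BatchSampler is used as the sampler.
        # pool.map returns results in the same order as `indices`.
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            samples = list(pool.map(self._load_sample, indices))
        return torch.stack(samples)


dataset = BatchedDataset(size=10)
sampler = BatchSampler(SequentialSampler(dataset), batch_size=4, drop_last=False)

# batch_size=None disables automatic batching, so each list of indices
# from the sampler is handed to __getitem__ unchanged.
loader = DataLoader(dataset, sampler=sampler, batch_size=None)

batches = [batch for batch in loader]
```

Creating the pool inside `__getitem__` on every call adds some overhead, so for real workloads you might want to create it once and reuse it.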