I am training my network with multiple datasets for multitask learning: I accumulate gradients from one batch of each dataset and then take a single optimization step.
I found that using a separate data loader for each dataset is not optimal, since each data loader may prefetch batches and consume GPU memory.
Is there a way to have a single data loader over multiple datasets that alternately samples from them?
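For context, my current setup looks roughly like this (the toy datasets, model, and loader settings are just illustrative placeholders):

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# toy stand-ins for the real multitask datasets
dataset_a = TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,)))
dataset_b = TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,)))
loader_a = DataLoader(dataset_a, batch_size=16)
loader_b = DataLoader(dataset_b, batch_size=16)

model = nn.Linear(10, 2)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for batch_a, batch_b in zip(loader_a, loader_b):
    optimizer.zero_grad()
    for data, target in (batch_a, batch_b):
        loss = criterion(model(data), target)
        loss.backward()  # gradients accumulate over both batches
    optimizer.step()     # single step after seeing each dataset once
```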
You could write a custom sampler, which is responsible for creating the indices passed to `Dataset.__getitem__` to select each sample, and iterate over the different datasets according to your logic.
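Here is a minimal sketch of such a sampler, assuming two datasets of equal length wrapped in a `ConcatDataset` (the class name `AlternatingSampler` and the toy datasets are just illustrative):

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, Sampler, TensorDataset

class AlternatingSampler(Sampler):
    """Alternates single samples from two datasets of equal length
    that were combined via ConcatDataset."""

    def __init__(self, len_a, len_b):
        self.len_a = len_a
        self.len_b = len_b

    def __iter__(self):
        perm_a = torch.randperm(self.len_a).tolist()
        # indices of the second dataset start at len_a inside the ConcatDataset
        perm_b = (torch.randperm(self.len_b) + self.len_a).tolist()
        for idx_a, idx_b in zip(perm_a, perm_b):
            yield idx_a
            yield idx_b

    def __len__(self):
        return 2 * min(self.len_a, self.len_b)

dataset_a = TensorDataset(torch.randn(64, 10), torch.zeros(64, dtype=torch.long))
dataset_b = TensorDataset(torch.randn(64, 10), torch.ones(64, dtype=torch.long))
concat = ConcatDataset([dataset_a, dataset_b])

sampler = AlternatingSampler(len(dataset_a), len(dataset_b))
# each batch now interleaves samples from both datasets
loader = DataLoader(concat, sampler=sampler, batch_size=16)
```

If you instead want each whole batch to come from a single dataset alternately, the same idea works with a `batch_sampler` that yields a full list of indices per dataset on each iteration.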
The common approach is also to load and process the samples on the CPU and to move the processed batch to the GPU inside the `DataLoader` loop. If you stick to this approach, multiple `DataLoader`s won't consume any GPU memory.
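A rough sketch of that loop, reusing the `loader` from the sampler example above (the model and device handling are illustrative):

```python
import torch
from torch import nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(10, 2).to(device)

for data, target in loader:  # `loader` from the sampler sketch above
    # the batch was loaded and processed on the CPU by the DataLoader;
    # it is moved to the GPU only here, inside the training loop
    data, target = data.to(device), target.to(device)
    output = model(data)
```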