Reusing worker processes across different DataLoader instances

I’m developing a codebase for continual-learning scenarios, which typically involve a set of individual datasets, one for each ‘task’ being learned. These task datasets are iterated over in order, with (typically) multiple epochs performed on each dataset.

To iterate over these datasets I have been using the PyTorch DataLoader class, and this works well within each dataset, as there is no latency between epochs. However, when moving from one task dataset to the next, the DataLoader is recreated (and hence, I imagine, so are its workers); when the number of workers is large this introduces a noticeable delay (on the order of 8 s for 128 workers).
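To make the pattern concrete, here is a minimal sketch of the loop I have in mind, with placeholder tensor datasets standing in for the real task datasets:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder stand-ins for the per-task datasets.
task_datasets = [TensorDataset(torch.randn(256, 8)) for _ in range(3)]

batch_counts = []
for dataset in task_datasets:
    # A fresh DataLoader per task: its worker pool is spawned from scratch,
    # which is where the multi-second startup delay shows up when
    # num_workers is large.
    loader = DataLoader(dataset, batch_size=32, num_workers=2)
    for epoch in range(2):
        batch_counts.append(sum(1 for _ in loader))
```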

Is there some way to make these workers persist across separate DataLoader instances? I know DataLoader has a “persistent_workers” keyword argument, but I believe it only applies within a single DataLoader instance and is intended to preserve workers across epochs, not across datasets.
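For reference, my understanding of persistent_workers is that it keeps one loader’s workers alive between iterations of that same loader, roughly like this (placeholder dataset again):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(256, 8))  # placeholder dataset

# persistent_workers requires num_workers > 0; workers are spawned on the
# first iteration and then reused for later epochs of this one loader.
loader = DataLoader(dataset, batch_size=32, num_workers=2,
                    persistent_workers=True)

# Three "epochs" over the same loader reuse the same worker pool.
epoch_batches = [sum(1 for _ in loader) for _ in range(3)]
```

As soon as this loader is discarded and a new one is built for the next task, the workers are (as far as I can tell) torn down and re-spawned.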

I could probably refactor my code so that the different tasks’ datasets live in the same dataset object, but this seems like it could be difficult…
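One version of that refactor I have been considering is a ConcatDataset plus a custom sampler that restricts sampling to the current task’s index range. Since the sampler is iterated in the main process (workers only receive indices), switching the active task between epochs should take effect without rebuilding the loader, so a single persistent-worker DataLoader can serve every task. A sketch, with placeholder datasets and a hypothetical TaskSampler class:

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, Sampler, TensorDataset

class TaskSampler(Sampler):
    """Yields shuffled indices from one task's slice of a ConcatDataset."""

    def __init__(self, cumulative_sizes):
        self.cumulative_sizes = cumulative_sizes
        self.task = 0

    def set_task(self, task):
        # Runs in the main process, so it affects the next epoch immediately.
        self.task = task

    def _bounds(self):
        lo = 0 if self.task == 0 else self.cumulative_sizes[self.task - 1]
        return lo, self.cumulative_sizes[self.task]

    def __iter__(self):
        lo, hi = self._bounds()
        yield from (lo + i for i in torch.randperm(hi - lo).tolist())

    def __len__(self):
        lo, hi = self._bounds()
        return hi - lo

# Placeholder per-task datasets, merged into one dataset object.
tasks = [TensorDataset(torch.randn(100, 4)) for _ in range(3)]
combined = ConcatDataset(tasks)
sampler = TaskSampler(combined.cumulative_sizes)

# One loader for all tasks: its workers are spawned once and then survive
# both epoch boundaries and task switches.
loader = DataLoader(combined, batch_size=10, sampler=sampler,
                    num_workers=2, persistent_workers=True)

total_batches = 0
for task_id in range(len(tasks)):
    sampler.set_task(task_id)
    for epoch in range(2):
        for (batch,) in loader:
            total_batches += 1  # training step would go here
```

I haven’t stress-tested this, but it avoids touching the dataset objects held by the workers, which is the part that would otherwise make an in-place “switch the active dataset” approach fragile.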