As I see it, the standard use of the DataLoader class is a sequence of operations:
- Call the dataloader.
- It performs some loading operations and returns the result.
- Then, the result of the dataloader is used for some operations by the main code.
My issue with this is that the loading operations are blocking and can take a significant amount of time. Conceivably, though, the loading could be performed ahead of time, with the result stored in a buffer associated with the dataloader. When the dataloader was called, the result could then be returned immediately with no processing, since it would already be in memory. The next batch of data could then be processed and loaded into the buffer *while the main code was executing, in parallel*:
- Call the dataloader, and it immediately returns the result stored in the buffer.
- The result of the dataloader is used by the main code while, simultaneously, a separate process loads the next data into the buffer.
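To make the loop above concrete, here is a minimal sketch of what I mean, written with Python's standard `threading` and `queue` modules rather than any PyTorch machinery (the class name `PrefetchIterator` and the `slow_load` generator are just placeholders for illustration):

```python
import threading
import queue
import time

class PrefetchIterator:
    """Wraps any iterable; a background thread fills a small buffer
    so that next() returns an already-loaded item immediately while
    the following item is being prepared in parallel."""

    _SENTINEL = object()

    def __init__(self, iterable, buffer_size=1):
        self._queue = queue.Queue(maxsize=buffer_size)
        self._thread = threading.Thread(
            target=self._worker, args=(iter(iterable),), daemon=True)
        self._thread.start()

    def _worker(self, it):
        for item in it:
            self._queue.put(item)  # blocks when the buffer is full
        self._queue.put(self._SENTINEL)

    def __iter__(self):
        return self

    def __next__(self):
        item = self._queue.get()
        if item is self._SENTINEL:
            raise StopIteration
        return item


def slow_load():
    # stand-in for an expensive, blocking loading operation
    for i in range(3):
        time.sleep(0.05)
        yield i


batches = list(PrefetchIterator(slow_load()))
print(batches)  # [0, 1, 2]
```

This is just to illustrate the buffering pattern I'm after, not a claim about how DataLoader works internally.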
This should be possible using multiprocessing, essentially by maintaining a buffer. However, as far as I can tell, the multiprocessing functionality built into the DataLoader doesn't work this way.
Does anyone know of a way to do this cleanly using existing PyTorch libraries? I'm confident I could do it with Python's multiprocessing module, but I'd prefer to avoid that, since sharing tensors across multiple processes can get a bit hairy.