DataLoader - how to parallelize loading several PNGs per item?

What is the fastest way to load several PNGs per item?

I tried parallelizing the loading by:

  • using asyncio: this doubled the speed, but it hangs one of our unit tests
  • using multiprocessing: but the DataLoader complains that it is already using multiprocessing and doesn’t allow its worker processes to have children

Thanks!

The idea is that the DataLoader worker processes prepare batches ahead of time (using multiprocessing), so except at the beginning of an epoch, the loading latency is hidden. That said, if your loading code doesn’t do a lot of Python computation (which needs the GIL), you could try threads.
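A minimal sketch of the thread idea, in case it helps. It assumes each item is a list of file paths. The names `MultiImageItems`, `read_bytes`, `load_fn` and `max_workers` are made up for illustration. `read_bytes` is only a stand-in that returns the raw file contents, so you would swap in your actual PNG decoder. The pool is created lazily so that each DataLoader worker process builds its own instead of inheriting one from the parent.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_bytes(path):
    # Placeholder loader (assumption): returns the raw file contents.
    # Replace with your real PNG decoding function.
    return Path(path).read_bytes()


class MultiImageItems:
    """Hypothetical map-style dataset: items[i] is a list of file paths."""

    def __init__(self, items, load_fn=read_bytes, max_workers=4):
        self.items = items
        self.load_fn = load_fn
        self.max_workers = max_workers
        # Created on first use, so the object stays picklable and each
        # worker process ends up with its own thread pool.
        self._pool = None

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        if self._pool is None:
            self._pool = ThreadPoolExecutor(max_workers=self.max_workers)
        # Executor.map runs load_fn concurrently and yields results
        # in the same order as the input paths.
        return list(self._pool.map(self.load_fn, self.items[index]))
```

Whether this is faster depends on how much of the decoding releases the GIL, so it is worth timing with your real loader. Your `Dataset.__getitem__` could use the same pattern directly.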

Best regards

Thomas