Asynchronous data loading

Dear Community,

My custom dataloader includes a CPU-intensive data preparation step, and I don’t want it to block training. The natural approach is to fill a buffer in the background so that workers can fetch prepared data from it in time. The built-in dataloader's multithreading doesn’t seem to handle this well: when one worker runs out of buffered data, it blocks the other workers from feeding data (reference). Is there a good way to do this?
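
To make the idea concrete, here is a minimal sketch of the kind of background buffer I have in mind, using only the Python standard library. The names `BackgroundPrefetcher`, `prepare`, and `buffer_size` are made up for illustration and are not part of any dataloader API. It relies on a thread, so in standard CPython a pure-Python CPU-bound step would still compete for the GIL; a process-based variant would probably be needed in that case.

```python
import queue
import threading


class BackgroundPrefetcher:
    """Hypothetical helper: fill a bounded buffer from a background thread."""

    _DONE = object()  # sentinel marking the end of the source iterable

    def __init__(self, source, buffer_size=4):
        self._source = source
        self._buffer = queue.Queue(maxsize=buffer_size)
        self._thread = threading.Thread(target=self._fill, daemon=True)
        self._thread.start()

    def _fill(self):
        # Runs in the background thread; put() blocks while the buffer is full.
        try:
            for item in self._source:
                self._buffer.put(item)
            self._buffer.put(self._DONE)
        except Exception as exc:
            # Forward the failure so the consumer does not wait forever.
            self._buffer.put(exc)

    def __iter__(self):
        # Runs in the consumer (training) thread; get() blocks while empty.
        while True:
            item = self._buffer.get()
            if item is self._DONE:
                return
            if isinstance(item, Exception):
                raise item
            yield item


def prepare(index):
    # Hypothetical stand-in for the CPU-intensive preparation step.
    return index * index


# The generator is consumed in the background while this loop does its own work.
batches = []
for batch in BackgroundPrefetcher((prepare(i) for i in range(5)), buffer_size=2):
    batches.append(batch)
```

The bounded queue is the point of the design: the producer stalls once `buffer_size` items are waiting, so memory stays capped, and the consumer only waits when the buffer is actually empty.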