FWIW, if you're using pytorch-lightning, the suggested import solution did not help. In my case, a custom dataset was generating ALL of its data on the fly. Altering it to pre-generate the data up front made the issue go away, even with non-zero workers.
Hi sneiman, could you please add more details about "generating ALL data on the fly" and "pre-generate the data up front"? I'm also using pytorch-lightning and this is really causing me trouble.
My initial synthetic dataset generated the data on demand, meaning the dataset did not have any data pre-cached, like a generator in Python. The data was not created until the `__getitem__()` call. This had the problem, unless I set the number of workers to 0.
I guessed that perhaps not having data available was confusing the division of the loading job among the workers. So, I changed the dataset to create all of its data when the constructor was called. This solved the problem; I usually run this with 8 workers now.
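For reference, here is a minimal sketch of what "create all of the data in the constructor" might look like. The class and parameter names are my own illustration, not from my actual code:

```python
# Hypothetical sketch: pre-generate every sample in __init__ so that
# __getitem__ only indexes into already-built tensors, instead of
# creating data on demand inside __getitem__.
import torch
from torch.utils.data import Dataset, DataLoader

class PregeneratedDataset(Dataset):
    def __init__(self, num_samples=1024, dim=16):
        # All data is materialized here, once, before any worker starts.
        self.data = torch.randn(num_samples, dim)
        self.labels = torch.randint(0, 2, (num_samples,))

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        # No data creation here -- just a lookup.
        return self.data[idx], self.labels[idx]

# num_workers=0 shown for portability; in my setup I raise it to 8
# once the data is pre-generated.
loader = DataLoader(PregeneratedDataset(), batch_size=32, num_workers=0)
```

The trade-off is memory: this only works if the whole dataset fits in RAM, which is why it suits synthetic data.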
Hope this helps.
Thanks for your reply! This is curious, since as far as I know the dataloader workers will prefetch batches automatically. I could not verify your approach, as I have a lot of videos that will not fit into memory. I'll try to dig deeper. Thank you again.
I thought it was odd as well; I had imagined the dataloaders simply make lots of calls to `__getitem__()`. However, PTL does do a lot of its own multiprocessing management, particularly with DDP.