list.pop() doesn't work in DataLoader

I’m training on videos. The frames of each video are stored in their own folder, and I have ten videos. I use the ‘indice’ value that comes from the DataLoader to decide which folder to use, so I need a second, random index to decide which frame in that folder to use. I also want to use every frame in a folder exactly once, without repetition. To do that, I create a random sequence and pop an index from it each time to get the frame index, but list.pop() doesn’t work as expected.


I’ve called pop() many times, but the sequence remains unchanged. How can I deal with this, please?

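Could you try with num_workers=0 instead? I think multiprocessing breaks your intended behaviour: with num_workers > 0, each worker process works on its own copy of the dataset, so a pop() inside a worker never changes the list held by the main process.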
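To make the multiprocessing point concrete, here is a minimal sketch in plain Python. It does not use DataLoader itself: `copy.deepcopy` only stands in for the independent copy of the dataset that a worker process ends up with, and `FrameDataset` and `remaining` are hypothetical names chosen for the illustration.

```python
import copy


class FrameDataset:
    """Hypothetical dataset that pops a frame index on every access."""

    def __init__(self, num_frames):
        self.remaining = list(range(num_frames))

    def __getitem__(self, index):
        # Mutates state that lives on this particular object only.
        return self.remaining.pop()


dataset = FrameDataset(5)

# Stand-in for a worker process: it operates on its own copy of the dataset.
worker_copy = copy.deepcopy(dataset)
worker_copy[0]
worker_copy[1]

print(len(worker_copy.remaining))  # 3: the copy has lost two entries
print(len(dataset.remaining))      # 5: the original is untouched
```

With num_workers=0 the dataset is accessed directly in the main process, so the pops would be visible there.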

The correct approach for this kind of problem is a custom sampler that provides the indices for all batches. The DataLoader can then still use multiple worker processes and stay fast: it asks the sampler (a single instance) for indices and passes those indices to the dataset, which loads the correct data.
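A rough sketch of such a sampler, under a few assumptions: the number of frames per video is known up front, and the dataset's `__getitem__` accepts a `(video_idx, frame_idx)` pair. The class name `VideoFrameSampler` and its parameters are made up for this example. It is written as a plain iterable, which the DataLoader `sampler` argument accepts; subclassing `torch.utils.data.Sampler` would work too.

```python
import random


class VideoFrameSampler:
    """Yields (video_idx, frame_idx) pairs so that every frame of every
    video is used exactly once per epoch, in random order."""

    def __init__(self, frames_per_video, seed=None):
        # frames_per_video[i] is the number of frames in video folder i.
        self.frames_per_video = list(frames_per_video)
        self.rng = random.Random(seed)

    def __iter__(self):
        # Build all (video, frame) pairs, then reshuffle them each epoch.
        pairs = [
            (video_idx, frame_idx)
            for video_idx, num_frames in enumerate(self.frames_per_video)
            for frame_idx in range(num_frames)
        ]
        self.rng.shuffle(pairs)
        return iter(pairs)

    def __len__(self):
        return sum(self.frames_per_video)


# Two videos with 3 and 2 frames: 5 pairs, none repeated within an epoch.
sampler = VideoFrameSampler([3, 2], seed=0)
epoch = list(sampler)
print(len(epoch), len(set(epoch)))  # 5 5
```

It could then be passed as `DataLoader(dataset, batch_size=..., sampler=sampler, num_workers=...)`. Because the sampler is iterated in the main process, the "no repetition" bookkeeping no longer depends on state inside the workers.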

More detailed explanations are in the docs.