Multi-worker DataLoader in a multiprocessing context

Hi, I am using PyTorch multiprocessing to speed up some data processing on the CPU side (in order to feed the GPU faster).

In each process's function, I also use a multi-worker DataLoader to speed up data loading.

Suppose I have N processes, each with M DataLoader workers, so there are N×M worker processes underneath.

In my dataset I want to consume all the data sequentially, meaning `__getitem__(self, idx)` takes an index as a parameter, and the different processes and DataLoader workers each handle different indices. How can I ensure they neither process duplicate indices nor miss any?
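For context, here is a rough sketch of the kind of partitioning I have in mind (assuming a map-style dataset of known length `data_len`; `shard_indices` is just an illustrative helper, not a PyTorch API). My understanding is that within one process, a map-style DataLoader's sampler already hands each index to exactly one of its M workers, so only the split across the N processes needs to be done by hand:

```python
# Sketch: split dataset indices disjointly across N processes.
# Each process would then wrap its shard (e.g. with
# torch.utils.data.Subset) and give it to its own DataLoader;
# the DataLoader's sampler distributes those indices across its
# M workers without duplication.

def shard_indices(data_len, num_procs, rank):
    """Round-robin shard: process `rank` gets every num_procs-th index."""
    return list(range(rank, data_len, num_procs))

# Check: the N shards are disjoint and together cover every index once.
N, data_len = 4, 10
shards = [shard_indices(data_len, N, r) for r in range(N)]
flat = sorted(i for shard in shards for i in shard)
assert flat == list(range(data_len))
```

Is this the right approach, or is there a built-in mechanism I should use instead?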