I wrote a fairly involved custom Dataset class that I pass to a DataLoader. It works fine with num_workers=0, but as soon as I use one or more workers, I run into issues.
In the __getitem__(self, index) method, in addition to returning the data and labels, I also want to cache intermediate results so that they can be reused later. For this purpose I have a dictionary, self.cache, and I store each result under self.cache[index].
The puzzling thing is not that two workers might try to read or write the same dictionary entry simultaneously. Rather, self.cache doesn't seem to be shared by the workers at all. For example, if I try to access the field from the training script (i.e. dataset.cache[index]) after iterating over the loader, there is nothing there.
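To make the question concrete, here is a stripped-down sketch of the setup. CachingDataset, its placeholder computation, and the helper cached_entries_after_one_epoch are made up for illustration; my actual class is more involved, so treat this as an approximation of the pattern rather than my real code.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class CachingDataset(Dataset):
    """Hypothetical stand-in for my dataset (illustration only)."""

    def __init__(self, n=8):
        self.n = n
        self.cache = {}  # index -> intermediate result

    def __len__(self):
        return self.n

    def __getitem__(self, index):
        if index not in self.cache:
            # Placeholder for the expensive intermediate computation.
            self.cache[index] = torch.full((3,), float(index))
        data = self.cache[index]
        label = index % 2  # dummy label, assumption for the example
        return data, label


def cached_entries_after_one_epoch(num_workers):
    """Iterate once over the data, then count cache entries as seen
    from the training script (the main process)."""
    dataset = CachingDataset()
    loader = DataLoader(dataset, batch_size=4, num_workers=num_workers)
    for _ in loader:
        pass
    return len(dataset.cache)


if __name__ == "__main__":
    # With 0 workers the cache is filled as I expect.
    print(cached_entries_after_one_epoch(0))
    # With 1 worker, dataset.cache in the training script stays empty.
    print(cached_entries_after_one_epoch(1))
```

The second call is the case I don't understand: the batches come back correctly, but the dictionary on the dataset object I hold in the training script never gets any entries.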
Is there a detailed guide/documentation on how to create a workaround for this? I’m obviously new to this …