Updating custom DataLoader class fields with multiple workers

I wrote a pretty involved custom DataLoader class. It works fine with num_workers=0, but as soon as I use one or more workers, I run into issues.

In my __getitem__(self, index) method, in addition to returning the data and labels, I also want to cache intermediate results so that they can be reused later. For this purpose I have a dictionary, self.cache, keyed by index, where I store these results.

The puzzling thing is not that two workers might try to read/write the same dictionary entry simultaneously. Rather, self.cache doesn’t seem to be shared between workers at all. For example, if I try to access the field from the training script (i.e. dataset.cache[index]), there is nothing there.

Is there a detailed guide/documentation on how to create a workaround for this? I’m obviously new to this …
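For anyone trying to reproduce this, here is a minimal sketch of the behaviour described above. It does not use PyTorch. It assumes each worker is a separate process holding its own copy of the dataset object, and it assumes a POSIX system where the "fork" start method is available. CachingDataset, worker and demo are hypothetical names made up for this illustration.

```python
import multiprocessing as mp


class CachingDataset:
    """Hypothetical stand-in for a map-style dataset with a per-index cache."""

    def __init__(self, n):
        self.n = n
        self.cache = {}  # plain dict: lives in the memory of one process only

    def __len__(self):
        return self.n

    def __getitem__(self, index):
        if index not in self.cache:
            self.cache[index] = index * index  # placeholder for expensive work
        return self.cache[index]


def worker(dataset, indices, out_queue):
    # Runs in a child process, on the child's own copy of `dataset`.
    for i in indices:
        dataset[i]
    # Report which keys ended up in the child's copy of the cache.
    out_queue.put(sorted(dataset.cache))


def demo():
    # Assumption: POSIX system, so the "fork" start method exists.
    ctx = mp.get_context("fork")
    dataset = CachingDataset(4)
    queue = ctx.Queue()
    proc = ctx.Process(target=worker, args=(dataset, [0, 1, 2], queue))
    proc.start()
    keys_in_worker = queue.get()
    proc.join()
    # The parent's cache was never touched by the child.
    return keys_in_worker, dict(dataset.cache)


if __name__ == "__main__":
    keys_in_worker, parent_cache = demo()
    print("keys cached in worker:", keys_in_worker)
    print("cache seen by parent: ", parent_cache)
```

In this sketch the child fills its own cache, while the parent's dataset.cache stays empty, which matches what I see from the training script.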

You could try to use a shared array as described here.

Thanks. That did the trick. Also glad to see that the Manager object supports many other types (list, dict, Namespace, Lock, RLock, Semaphore, BoundedSemaphore, Condition, Event, Barrier, Queue, Value and Array).
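In case it helps others, here is a rough sketch of the Manager-based variant using a shared dict rather than an array. It is not the exact code from my project. SharedCacheDataset, worker and demo are hypothetical names, the squared index stands in for the real intermediate results, and the "fork" start method is again an assumption (POSIX only). Plain processes play the role of the DataLoader workers so that the snippet runs without PyTorch.

```python
import multiprocessing as mp


class SharedCacheDataset:
    """Hypothetical dataset whose cache is a Manager dict proxy."""

    def __init__(self, n, cache):
        self.n = n
        self.cache = cache  # proxy object: reads/writes go to the manager process

    def __len__(self):
        return self.n

    def __getitem__(self, index):
        try:
            return self.cache[index]
        except KeyError:
            value = index * index  # placeholder for expensive work
            self.cache[index] = value
            return value


def worker(dataset, indices):
    # Runs in a child process; writes land in the shared manager dict.
    for i in indices:
        dataset[i]


def demo():
    # Assumption: POSIX system, so the "fork" start method exists.
    ctx = mp.get_context("fork")
    with ctx.Manager() as manager:
        dataset = SharedCacheDataset(6, manager.dict())
        procs = [
            ctx.Process(target=worker, args=(dataset, indices))
            for indices in ([0, 1, 2], [3, 4, 5])
        ]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        # Copy into a plain dict before the manager shuts down.
        return dataset.cache.copy()


if __name__ == "__main__":
    print(demo())
```

Here the parent does see the entries written by both workers. One thing to keep in mind is that every access to the proxy is a round trip to the manager process, so this is slower than a plain dict and only pays off when the cached computation is expensive. With a real DataLoader, the dataset would presumably subclass torch.utils.data.Dataset and be passed with num_workers > 0.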