DataLoader worker thread-local state

What’s the proper place for storing per-worker thread-local state, like buffer tensors? Is it the dataset object (since it’s copied into each worker process and is not supposed to be shared)?

E.g. my __getitem__ would like to store some buffers that can be reused on the next __getitem__ call, in order to save some CPU-side NumPy allocations.
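For concreteness, here is a minimal sketch of the pattern being described (class and parameter names are hypothetical). Because the DataLoader gives each worker process its own copy of the dataset, an attribute set on the dataset is effectively per-worker state:

```python
import numpy as np

class ReusedBufferDataset:
    # Hypothetical example of the pattern described above: a scratch
    # buffer stored on the dataset object and reused across __getitem__
    # calls instead of allocating a fresh array every time.
    def __init__(self, n_items, shape=(4, 4)):
        self.n_items = n_items
        # Eagerly allocated here; each DataLoader worker process gets
        # its own copy of the dataset, so this buffer is not shared.
        self.buf = np.empty(shape, dtype=np.float32)

    def __len__(self):
        return self.n_items

    def __getitem__(self, idx):
        self.buf.fill(float(idx))  # reuse the buffer, no new allocation
        return self.buf.sum()
```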

cc @VitalyFedyunin, who has been looking at the DataLoader recently.

My best suggestion is to wait for the functionality in https://github.com/pytorch/pytorch/pull/35795.
The second-best option is on-demand buffer allocation.
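On-demand allocation might look like the following sketch (names are hypothetical): the buffer is created on first use inside __getitem__, so it is never allocated in the main process and then duplicated into every worker.

```python
import numpy as np

class LazyBufferDataset:
    # Sketch of the on-demand buffer allocation suggestion: the buffer
    # is allocated lazily on the first __getitem__ call, which in a
    # multi-worker DataLoader happens inside each worker process.
    def __init__(self, n_items, shape=(4, 4)):
        self.n_items = n_items
        self.shape = shape
        self._buf = None  # allocated lazily in __getitem__

    def __len__(self):
        return self.n_items

    def __getitem__(self, idx):
        if self._buf is None:  # first call in this process
            self._buf = np.empty(self.shape, dtype=np.float32)
        self._buf.fill(float(idx))
        return self._buf.mean()
```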