Everywhere I checked, I saw the note:
To use multi-threading with numpy random in the DataLoader, use the worker_init_fn with torch.initial_seed()
I’m trying to understand exactly what’s happening with this code snippet:
worker_init_fn=lambda _: np.random.seed(int(torch.initial_seed()) % (2**32 - 1))
I know that np.random.seed() requires an integer seed in the range [0, 2**32 - 1]. So converting the long returned by torch.initial_seed() to an int and taking it modulo 2**32 - 1 gives a seed in that range.
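To check my understanding, I wrote a small simulation (no real DataLoader, just mimicking what I believe is PyTorch's behavior of giving each worker the initial seed base_seed + worker_id, which is what torch.initial_seed() returns inside a worker):

```python
import numpy as np

def seed_for_worker(base_seed: int, worker_id: int) -> int:
    # PyTorch (as I understand it) seeds each worker with base_seed + worker_id;
    # torch.initial_seed() called inside a worker returns that per-worker value.
    # np.random.seed only accepts seeds in [0, 2**32 - 1], hence the modulo.
    return (base_seed + worker_id) % (2**32)

# Simulate 4 workers sharing one base seed:
base_seed = 123456789123456789  # stand-in for a torch.initial_seed()-style long
streams = []
for worker_id in range(4):
    np.random.seed(seed_for_worker(base_seed, worker_id))
    streams.append(np.random.randint(0, 10**9, size=3).tolist())
```

If that mental model is right, each worker ends up with a different numpy seed (and hence a different random stream), because the worker_id offset is already baked into torch.initial_seed() before the modulo is applied.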
Does this mean that each worker is initialized with this number as seed?
Or does it mean that each worker is initialized with this number + worker_id as seed?
And does the per-worker seed change between epochs? (I'm thinking it should, since each worker seems to be a new process spawned by the main Python process…?)