Where to set the manual seed

I have three files in my current pipeline:

  1. net.py that defines a network I want to use
  2. data.py that reads the data and creates a dataloader object
  3. train.py that runs the experiment

I can think of randomness coming into play when setting the initial weights of the network, in its dropout regularization, and when the dataloader object fetches batches.

My question is, where do I need to set the random seed? Do I need to set it within all three files? Or is it enough for it to be set within train.py and then the code will be able to use it when I load in the network and dataloader objects?

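Setting the seeds at the beginning of your main script (train.py) should be enough to cover the PRNG usage in the other files/modules, as long as the seeds are set before the network and the dataloader are created. Imported modules run in the same process and draw from the same global generators. The DataLoader can be a bit trickier, particularly with multiple worker processes, so take a look at the DataLoader section of the Reproducibility docs.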
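
As a rough sketch of what this could look like at the top of train.py: the helper names (set_seed, seed_worker, make_loader) and the toy dataset are made up for illustration and are not from your code. The worker seeding follows the pattern described in the Reproducibility docs, so treat it as a starting point rather than a guarantee of bit-exact runs.

```python
import random

import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset


def set_seed(seed: int) -> None:
    """Seed the global Python, NumPy, and PyTorch generators (hypothetical helper)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)  # seeds the default PyTorch generators


def seed_worker(worker_id: int) -> None:
    """Give each DataLoader worker a reproducible NumPy/random seed."""
    # Inside a worker, torch.initial_seed() is already derived from the
    # loader's base seed and the worker id; reuse it for the other libraries.
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)


def make_loader(seed: int, num_workers: int = 0) -> DataLoader:
    """Build a shuffling DataLoader whose order depends only on `seed`."""
    dataset = TensorDataset(torch.arange(10))  # toy stand-in for data.py
    g = torch.Generator()
    g.manual_seed(seed)  # dedicated generator controls the shuffle order
    return DataLoader(
        dataset,
        batch_size=2,
        shuffle=True,
        num_workers=num_workers,
        worker_init_fn=seed_worker,  # only called when num_workers > 0
        generator=g,
    )
```

In train.py you would call set_seed(...) once, before constructing the network from net.py and the loader from data.py, so that weight initialization and dropout draw from the seeded generators.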
