Result reproducibility in PyTorch

Megh_Bhalerao · July 10, 2020, 10:01am

Hi everyone,
I have a question regarding the reproducibility of results in PyTorch. In my main file (trainer.py), I am doing the following:

torch.cuda.manual_seed(seed_val)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

I am doing this to ensure that my results match, at least up to some precision, across different runs.

So my question is: do I have to write these 3 lines in every file that imports PyTorch (for example, the files where I define the neural network model, the loss function, the data loaders, and so on)? Or is it enough to include them ONLY in trainer.py, where I instantiate and initialize the loss function, the NN model, and everything else?

Please help me out.

Thank you.

ptrblck · July 12, 2020, 2:36am

It should be sufficient to set the seed after the first import of PyTorch.
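A minimal sketch of what that could look like in trainer.py, assuming a single process. The names `seed_everything` and `build_model` are hypothetical and only stand in for your own code. Note that `torch.manual_seed` seeds the CPU generator as well as the CUDA ones, whereas `torch.cuda.manual_seed` alone only seeds the current GPU, so weights initialized on the CPU would not be covered by it.

```python
import random

import torch


def seed_everything(seed_val: int) -> None:
    """Hypothetical helper: call it once, near the top of trainer.py."""
    random.seed(seed_val)                 # Python's built-in RNG
    torch.manual_seed(seed_val)           # CPU generator; also seeds the CUDA generators
    torch.cuda.manual_seed_all(seed_val)  # explicit; silently ignored without CUDA
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


def build_model() -> torch.nn.Module:
    # Stand-in for a model defined in another file; no seeding code is needed there.
    return torch.nn.Linear(4, 2)


seed_everything(0)
first = build_model().weight.detach().clone()

seed_everything(0)
second = build_model().weight.detach().clone()

# Both initializations start from the same generator state, so they match.
same_weights = torch.equal(first, second)
```

The seed and the cuDNN flags are global to the Python process, which is why the model, loss, and data loader files pick them up without repeating these lines.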
As long as the control flow stays the same (i.e. the calls to the pseudo-random number generator happen in the same order), you should get the same outputs, apart from the limitations mentioned in the reproducibility docs.
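To illustrate the point about control flow, here is a small CPU-only sketch (the variable names are just for illustration). Re-seeding and repeating the same calls gives the same values, while a single extra draw from the generator shifts everything that comes after it.

```python
import torch

torch.manual_seed(0)
a = torch.rand(3)

torch.manual_seed(0)
b = torch.rand(3)   # same seed, same sequence of RNG calls -> same values as a

torch.manual_seed(0)
_ = torch.rand(1)   # one extra draw before the call we care about
c = torch.rand(3)   # generator state has advanced, so these values differ from a

same_flow_matches = torch.equal(a, b)
extra_call_matches = torch.equal(a, c)
```

So anything that adds or removes a random call between runs, such as an extra augmentation or a different layer initialization order, can change the results even with an identical seed.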