Towards Reproducible Training Results

The thread Manual seed cannot make dropout deterministic on CUDA for Pytorch 1.0 preview version mentions that seeding does not help reproducibility when a model contains modules like nn.Dropout. A potential solution suggested there is to use torch.set_rng_state(). Here are my findings:

  1. LeNet5 does not contain a dropout layer, yet seeding still failed to give reproducible results. The seeds are configured as below.
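    import os
    import torch
    import torch.backends.cudnn as cudnn

    # Has no effect on hash randomization of the already-running interpreter;
    # it only affects Python processes started after this point.
    os.environ['PYTHONHASHSEED'] = str(args.seed)

    # Set seed for pytorch
    torch.manual_seed(args.seed)  # Set seed for CPU
    torch.cuda.manual_seed(args.seed)  # Set seed for the current GPU
    torch.cuda.manual_seed_all(args.seed)  # Set seed for all the GPUs
    cudnn.benchmark = False  # Disable cuDNN autotuning of convolution algorithms
    cudnn.deterministic = True  # Ask cuDNN to use deterministic algorithms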
  2. When torch.nn.DataParallel is used, the results vary slightly more noticeably between runs.

I would be grateful if you could explain why this happens.

Which PyTorch version are you using and did you follow the steps described in the reproducibility docs? Since the issue you’ve linked is quite old I would assume things have changed already and the described CUDA seeding issue was fixed.
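For reference, the main steps from the reproducibility docs can be sketched roughly as follows. This is a minimal sketch, not the poster's actual setup: `seed_everything` is a hypothetical helper name, and the `CUBLAS_WORKSPACE_CONFIG` value is one of the settings the docs suggest for CUDA 10.2 or later.

```python
import os
import random

import torch


def seed_everything(seed: int) -> None:
    """Hypothetical helper: seed the common RNGs and request deterministic kernels."""
    # cuBLAS needs this for deterministic behavior on CUDA >= 10.2.
    # It should be set before the first CUDA call in the process.
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")

    random.seed(seed)                  # Python's built-in RNG
    torch.manual_seed(seed)            # PyTorch CPU RNG
    torch.cuda.manual_seed_all(seed)   # all GPUs; safe to call without CUDA
    # If NumPy is used (e.g. in transforms), np.random.seed(seed) is needed too.

    torch.backends.cudnn.benchmark = False
    # Raises a RuntimeError when an op has no deterministic implementation,
    # which helps locate the source of run-to-run differences.
    torch.use_deterministic_algorithms(True)


seed_everything(0)
first = torch.rand(3)
seed_everything(0)
second = torch.rand(3)
print(torch.equal(first, second))
```

Compared with setting only `cudnn.deterministic = True`, `torch.use_deterministic_algorithms(True)` also covers non-cuDNN operations and reports an error instead of silently running a nondeterministic kernel.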

Hello @ptrblck,

Thank you so much for your reply.

I am using pytorch 1.13.0 and I did follow the steps shown in Reproducibility — PyTorch 1.13 documentation.

Here are some clues:

  1. I noticed that seeding as shown above kept the data sequence fixed (even with dataset shuffling and random transforms on samples).

  2. In addition, seeding by itself guaranteed that the model (LeNet5 in my case) was initialized identically across different runs.
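The two observations above can be checked with a small script along these lines. It is only a sketch under assumptions: the tiny `nn.Sequential` model and the 16-element `TensorDataset` are stand-ins for LeNet5 and the real dataset, and `run_once` is a hypothetical helper name.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def run_once(seed: int):
    """Hypothetical helper: return the shuffled sample order and initial weights."""
    torch.manual_seed(seed)

    # Stand-in model (assumption); the real model here would be LeNet5.
    model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
    weights = [p.detach().clone() for p in model.parameters()]

    # Stand-in dataset whose samples are just their own indices.
    dataset = TensorDataset(torch.arange(16))
    generator = torch.Generator()
    generator.manual_seed(seed)  # dedicated generator controls the shuffle order
    loader = DataLoader(dataset, batch_size=4, shuffle=True, generator=generator)
    order = torch.cat([batch[0] for batch in loader]).tolist()

    return order, weights


order_a, weights_a = run_once(0)
order_b, weights_b = run_once(0)

same_order = order_a == order_b
same_init = all(torch.equal(x, y) for x, y in zip(weights_a, weights_b))
print(same_order, same_init)
```

If both checks hold and the training results still differ, the remaining nondeterminism most likely comes from the forward and backward computations rather than from data loading or initialization.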