I’m pretty new to PyTorch, but I was trying to run the same model twice in a row, loading a checkpoint at epoch 0 each time. Is it normal for the model parameters to differ between these two runs? Is it possible that the optimizer I am using (AdamW) creates a difference in parameters between iterations?
You can try setting a seed for the random number generator.
Most layers with learnable parameters rely on random initialization.
If you call
torch.manual_seed(some_number)
at the start of each run, both runs should draw the same random numbers.
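To illustrate the seeding suggestion, here is a minimal sketch. The helper name seed_everything and the seed value 0 are made up for this example, not something from the thread. It reseeds before building the same layer twice and checks that the initial weights match.

```python
import random

import torch


def seed_everything(seed: int) -> None:
    """Seed Python's and PyTorch's random number generators (hypothetical helper)."""
    random.seed(seed)
    # torch.manual_seed seeds the CPU generator and the generators of all CUDA devices.
    torch.manual_seed(seed)


# Build the same layer twice, reseeding before each construction.
seed_everything(0)
first = torch.nn.Linear(4, 2).weight.detach().clone()

seed_everything(0)
second = torch.nn.Linear(4, 2).weight.detach().clone()

# With identical seeds the random initialization is identical.
print(torch.equal(first, second))
```

If your data pipeline uses NumPy for augmentation or shuffling, you would probably want to seed that as well.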
Thanks for your reply. Would you recommend doing this even with checkpoint loading enabled? I was under the impression that loading checkpoints would make sure the initial random initialization was the same for both runs.
I don’t know what you’ve stored in the “checkpoint”, but usually only the state_dicts are stored, without any information about the seeds etc. In any case, you would still need to reseed the code as mentioned by @Matias_Vasquez, as loading the state_dicts alone won’t seed the code.
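As a rough sketch of that point: the state_dicts hold parameters and optimizer state, but not the state of the random number generator. If you want a resumed run to continue with the same random stream, one option is to store the RNG state next to them. The dictionary keys below are arbitrary names chosen for this example, and the checkpoint is kept in memory rather than written with torch.save to keep the example short.

```python
import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Snapshot parameters, optimizer state, and the CPU RNG state together.
checkpoint = {
    "model": model.state_dict(),
    "optimizer": optimizer.state_dict(),
    "cpu_rng_state": torch.get_rng_state(),
}

# Anything random that happens after the snapshot advances the generator.
expected = torch.rand(3)

# Restoring only the state_dicts would leave the generator where it is now;
# restoring the RNG state as well rewinds it to the snapshot.
model.load_state_dict(checkpoint["model"])
optimizer.load_state_dict(checkpoint["optimizer"])
torch.set_rng_state(checkpoint["cpu_rng_state"])
replayed = torch.rand(3)

print(torch.equal(expected, replayed))
```

On a GPU you would likely also want to save and restore the CUDA generator state, for example with torch.cuda.get_rng_state_all() and torch.cuda.set_rng_state_all().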
Great, thanks to both of you for your input! I’ll try that out.
It turns out a few of my model layers were non-deterministic, but I was able to track them down using the line torch.use_deterministic_algorithms(True).
Thanks @Matias_Vasquez for the helpful link on reproducibility
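For anyone finding this later, here is a minimal sketch of how that flag can be used. The toy model is made up for this example. With the flag on, operations that only have a non-deterministic implementation raise a RuntimeError when they are called, which is how the offending layers can be located.

```python
import torch

torch.manual_seed(0)

# Ask PyTorch to use deterministic implementations where they exist and to
# raise a RuntimeError for operations that have none.
torch.use_deterministic_algorithms(True)

# Hypothetical toy model; these CPU ops have deterministic implementations.
model = torch.nn.Sequential(
    torch.nn.Linear(8, 16),
    torch.nn.ReLU(),
    torch.nn.Linear(16, 1),
)

x = torch.randn(4, 8)
loss = model(x).sum()
loss.backward()  # a non-deterministic op would raise in the forward or backward pass

print(torch.are_deterministic_algorithms_enabled())
```

On CUDA, some operations may additionally need the CUBLAS_WORKSPACE_CONFIG environment variable to be set; the reproducibility notes in the PyTorch docs describe this.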