Reproducing PyTorch Model Parameters

csl · March 24, 2022, 10:06pm

I’m pretty new to PyTorch. I was trying to run the same model twice in a row, loading the same checkpoint at epoch 0 each time. Is it normal for the model parameters to differ between these two runs? Is it possible that the optimizer I am using (AdamW) creates a difference in parameters between iterations?

Matias_Vasquez · March 24, 2022, 11:00pm

You can try setting a seed for the random number generator.

Most layers with learnable parameters rely on random initialization.

If you call

torch.manual_seed(some_number)

before building the model, then it should be fine.

https://pytorch.org/docs/stable/notes/randomness.html
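
As a minimal sketch of the seeding advice above (the helper `build_layer` and the layer sizes are made up for illustration, not taken from the original post), seeding right before constructing a layer makes its random initialization repeat:

```python
import torch
import torch.nn as nn


def build_layer(seed: int) -> nn.Linear:
    # Hypothetical helper: seed the global RNG, then construct the layer,
    # so the random weight initialization depends only on the seed.
    torch.manual_seed(seed)
    return nn.Linear(4, 2)


a = build_layer(0)
b = build_layer(0)  # same seed as `a`
c = build_layer(1)  # different seed

# Same seed gives identical initial weights; a different seed does not.
same_seed_equal = torch.equal(a.weight, b.weight)
diff_seed_equal = torch.equal(a.weight, c.weight)
print(same_seed_equal, diff_seed_equal)
```

Note that `torch.manual_seed` only covers PyTorch's own generators. If the training script also draws numbers from Python's `random` module or from NumPy, those would likely need to be seeded separately.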

csl · March 24, 2022, 11:51pm

Thanks for your reply. Would you recommend doing this even with checkpoint loading enabled? I was under the impression that loading checkpoints would make sure the initial random initialization was the same for both runs.

ptrblck · March 25, 2022, 12:10am

I don’t know what you’ve stored in the “checkpoint”, but usually only the state_dicts are stored, without any information about the seeds etc. In any case, you would still need to reseed the code as mentioned by @Matias_Vasquez, as loading the state_dicts alone won’t seed the code.
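
To illustrate the point above, here is a rough sketch of a checkpoint that stores the CPU RNG state next to the state_dicts. The dictionary keys, the tiny `nn.Linear` model, and the in-memory buffer are assumptions made for this example; nothing here reflects what the original poster's checkpoint actually contained.

```python
import io

import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 2)  # placeholder model for illustration
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Hypothetical checkpoint layout: the key names are chosen for this example.
checkpoint = {
    "model": model.state_dict(),
    "optimizer": optimizer.state_dict(),
    "cpu_rng_state": torch.get_rng_state(),  # not part of any state_dict
}
buffer = io.BytesIO()  # stands in for a checkpoint file on disk
torch.save(checkpoint, buffer)

first_draw = torch.rand(3)  # random numbers the "first run" would consume

# "Second run": rebuild everything and load the checkpoint.
buffer.seek(0)
restored = torch.load(buffer)
new_model = nn.Linear(4, 2)  # fresh random init, overwritten by the load
new_model.load_state_dict(restored["model"])
new_optimizer = torch.optim.AdamW(new_model.parameters(), lr=1e-3)
new_optimizer.load_state_dict(restored["optimizer"])

# The weights now match, but the RNG is in a different state than it was
# at save time, so the next random draw differs from the first run.
draw_without_rng = torch.rand(3)

# Restoring the saved RNG state makes the next draw repeat the first run.
torch.set_rng_state(restored["cpu_rng_state"])
draw_with_rng = torch.rand(3)
```

Here the parameters are identical after `load_state_dict`, yet anything that consumes random numbers afterwards (dropout, data shuffling, augmentation) would only repeat once the RNG is reseeded or its state is restored. On a GPU, the CUDA generators keep their own state, which `torch.cuda.get_rng_state_all()` can capture in a similar way.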

csl · March 25, 2022, 12:52am

Great, thanks to both of you for your input! I’ll try that out.

csl · March 25, 2022, 1:21am

It turns out a few of my model layers were non-deterministic, but I was able to track them down by adding the line torch.use_deterministic_algorithms(True).

Thanks @Matias_Vasquez for the helpful link on reproducibility.
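
For readers who want to try the same approach, a minimal sketch follows. The small `nn.Sequential` model and the input shape are placeholders, not the layers from the original post. With the flag enabled, PyTorch raises a `RuntimeError` when an operation without a deterministic implementation is called, and the error message names that operation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Ask PyTorch to use deterministic implementations where available and to
# raise a RuntimeError for operations that have none.
torch.use_deterministic_algorithms(True)

# Placeholder model and data, assumed for illustration only.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
inputs = torch.randn(16, 4)

try:
    loss = model(inputs).sum()
    loss.backward()
    error_message = None  # every op used here has a deterministic path
except RuntimeError as err:
    # The message identifies the non-deterministic operation.
    error_message = str(err)

flag_enabled = torch.are_deterministic_algorithms_enabled()
print(flag_enabled, error_message)

# Switch the flag back off so later code in the same process is unaffected.
torch.use_deterministic_algorithms(False)
```

On CUDA, some operations additionally require the `CUBLAS_WORKSPACE_CONFIG` environment variable to be set (for example to `:4096:8`) before they can run deterministically; the reproducibility notes linked earlier in the thread cover the details.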