I am running the script below (which sets the manual seed to 1 for both CPU and GPU), but it does not give me reproducible results on GPU (on CPU it works fine). Is this a known issue, or am I missing something?
I am also facing a similar reproducibility issue with PyTorch.
As suggested, I tried disabling cuDNN, but that slows execution down by 5-10x.
With cuDNN disabled, I am still not able to reproduce the results. I found out that the issue is caused by the torchvision transforms. When I disable the following lines in the code,
then I am able to reproduce the same results. Setting the seed using random.seed() doesn't work either. Is this behavior expected, or am I doing something wrong?
I tried that workaround in the simple case of the cifar10 tutorial: http://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html
after having added only the Flip transform line:
I also inserted a seed initialization:
and I even disabled the shuffle.
I am still not able to reproduce the results. Any other idea?
It seems that transforms.RandomCrop() and transforms.RandomHorizontalFlip() use the "plain" Python seed, not the torch one. I fixed the reproducibility issue with:
Reproducibility is preserved with shuffle enabled, but not with num_workers > 0.
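For anyone landing here later: this is not the snippet from the post above, just a minimal sketch of the idea, assuming the fix amounts to seeding Python's `random` module alongside torch. The helper names `seed_everything` and `seed_worker` are hypothetical. For the `num_workers > 0` case, one commonly used approach is to re-seed each worker through the DataLoader's `worker_init_fn`. I have not verified that it resolves the case described above.

```python
import random


def seed_everything(seed: int) -> None:
    """Seed every RNG that might be used during training (hypothetical helper)."""
    # Python's own RNG: per the post above, the random transforms drew from this one.
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy is optional for this sketch
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # silently ignored when CUDA is unavailable
    except ImportError:
        pass  # lets the sketch run on a machine without PyTorch


def seed_worker(worker_id: int) -> None:
    """Meant to be passed as DataLoader(worker_init_fn=seed_worker)."""
    import torch
    # Inside a worker, torch.initial_seed() already differs per worker.
    worker_seed = torch.initial_seed() % 2**32
    random.seed(worker_seed)
    try:
        import numpy as np
        np.random.seed(worker_seed)
    except ImportError:
        pass
```

Usage would be roughly `seed_everything(1)` at the top of the script, then `DataLoader(dataset, num_workers=2, worker_init_fn=seed_worker)`. Passing a seeded `torch.Generator` through the `generator` argument may also help keep the shuffle order fixed.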
Then a new model was initialized to train on the next fold. In the end there are 10 results, one for each of the 10 folds.
I can reproduce the same results if I start the program from the very beginning (from the first fold).
But when I tried to reproduce only the 7th fold result by loading the state saved after the 6th fold during the previous training, the result was different from the one I got in that previous training.
I tried to disable CUDNN and set
I didn’t manage to reproduce it.
But the results obtained by loading the 6th state are the same from run to run,
and I can reproduce the results of the full from-the-beginning training by loading the states on CPU.
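The thread does not settle the cause of this. One thing that may be worth ruling out is that the random number generators are in a different state when resuming than they were at the same point in the full run, since the CPU and CUDA generators are tracked separately. A rough sketch of saving and restoring those states together with the checkpoint follows. The function names `capture_rng_state` and `restore_rng_state` are hypothetical, and this is an assumption about the problem, not a confirmed fix.

```python
import random


def capture_rng_state() -> dict:
    """Collect the current RNG states so they can be saved next to a checkpoint."""
    state = {"python": random.getstate()}
    try:
        import torch
        state["torch_cpu"] = torch.get_rng_state()
        if torch.cuda.is_available():
            # One state tensor per visible GPU
            state["torch_cuda"] = torch.cuda.get_rng_state_all()
    except ImportError:
        pass  # lets the sketch run without PyTorch installed
    return state


def restore_rng_state(state: dict) -> None:
    """Put the RNGs back where they were when capture_rng_state() was called."""
    random.setstate(state["python"])
    try:
        import torch
    except ImportError:
        return
    if "torch_cpu" in state:
        torch.set_rng_state(state["torch_cpu"])
    if "torch_cuda" in state and torch.cuda.is_available():
        torch.cuda.set_rng_state_all(state["torch_cuda"])
```

The idea would be to call `capture_rng_state()` when saving the state at the end of a fold and `restore_rng_state(...)` right after loading it. Even then, non-deterministic GPU kernels could still produce differences.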
You might want to check the reproducibility section of the doc.
In particular, there are a few operations that are inherently non-deterministic, so you won't be able to get reproducible results if you use them.
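As a rough illustration of the settings usually discussed alongside this, here is a minimal sketch that asks cuDNN for deterministic kernels. The wrapper `configure_determinism` and its `strict` flag are hypothetical; the torch attributes it sets are real, but they do not cover every non-deterministic operation, and they can slow training down.

```python
def configure_determinism(strict: bool = False) -> bool:
    """Request deterministic behaviour from PyTorch; returns False if torch is missing."""
    try:
        import torch
    except ImportError:
        return False
    # Ask cuDNN to pick deterministic kernels and skip the autotuner,
    # whose kernel choice can vary from run to run.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    if strict and hasattr(torch, "use_deterministic_algorithms"):
        # Newer releases only: raises an error when an op has no deterministic
        # implementation. Some CUDA ops additionally require the
        # CUBLAS_WORKSPACE_CONFIG environment variable to be set.
        torch.use_deterministic_algorithms(True)
    return True
```

Calling `configure_determinism()` once before building the model should be enough. With `strict=True`, the errors it raises at least point to which operation is the non-deterministic one.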
I'm facing a weird situation: consecutive runs of a script across multiple GPUs reproduce the same results, but running the same script a few days later gives different scores, which are then again reproducible, at least in the short term.