Non-reproducible results with GPU

I am running the script below (which sets the manual seed to 1 for both CPU and GPU), but it does not give me reproducible results on the GPU (on the CPU it works fine). Is this a known issue, or am I missing something?

if args.cuda:

If you’re using cuDNN, if I remember correctly, some of its kernels are non-deterministic.
Disabling cuDNN might help.
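
For reference, a minimal sketch of what disabling cuDNN looks like in PyTorch. It only flips the global backend flag. Whether this removes the non-determinism in a given model is an assumption to verify, and it usually costs speed.

```python
import torch

# Turn cuDNN off globally so PyTorch falls back to its own CUDA kernels.
# This is typically slower than running with cuDNN enabled.
torch.backends.cudnn.enabled = False

print("cuDNN enabled:", torch.backends.cudnn.enabled)
```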

I am currently facing a similar reproducibility issue in PyTorch.

As suggested, I tried disabling cuDNN, but that slows PyTorch down by 5-10x.

With cuDNN disabled, I am still not able to reproduce the results. I found that the issue comes from the torchvision transforms. When I disable the following line in the code,
transforms.RandomCrop(32, padding=4),

then I am able to reproduce the same results. Setting the seed with random.seed() doesn’t work either. Is this behavior expected, or am I doing something wrong?

Workaround: if you set num_workers=0 in the DataLoader, the results should be reproducible.

I tried that workaround in the simple case of the CIFAR-10 tutorial:
after adding only the flip transform line:
I also inserted a seed initialization:
and I even disabled shuffling.
I am still not able to reproduce the results. Any other ideas?

It seems that transforms.RandomCrop() and transforms.RandomHorizontalFlip() use the “plain” Python seed, not the torch one. I fixed the reproducibility issue with:
import random
Reproducibility is preserved with shuffle enabled, but not with num_workers>0.
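
A minimal sketch of one way to handle both the Python `random` seed and the num_workers>0 case, following the pattern in the PyTorch reproducibility notes (`worker_init_fn` plus a seeded `torch.Generator`). The helper names `seed_everything`, `seed_worker` and `make_loader` are hypothetical, and whether this is sufficient for a particular torchvision version is an assumption to check.

```python
import random

import torch
from torch.utils.data import DataLoader, TensorDataset


def seed_everything(seed):
    """Seed the Python, torch CPU and torch CUDA generators (hypothetical helper)."""
    random.seed(seed)
    torch.manual_seed(seed)
    # Safe to call without a GPU; it is silently ignored in that case.
    torch.cuda.manual_seed_all(seed)


def seed_worker(worker_id):
    """Re-seed Python's `random` inside each DataLoader worker process."""
    # Inside a worker, torch.initial_seed() is already distinct per worker.
    worker_seed = torch.initial_seed() % 2**32
    random.seed(worker_seed)


def make_loader(dataset, seed, batch_size=4, num_workers=0):
    """Build a shuffling DataLoader whose sample order depends only on `seed`."""
    g = torch.Generator()
    g.manual_seed(seed)
    return DataLoader(
        dataset,
        batch_size=batch_size,
        shuffle=True,
        num_workers=num_workers,
        worker_init_fn=seed_worker,
        generator=g,
    )


if __name__ == "__main__":
    seed_everything(1)
    toy = TensorDataset(torch.arange(16))  # stand-in for a real dataset
    for (batch,) in make_loader(toy, seed=1):
        print(batch.tolist())
```

Passing `num_workers=2` (or more) to `make_loader` uses the same seeding path. Any other library used in the transforms, such as NumPy, would need to be seeded in `seed_worker` in the same way.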

I have the same problem.
My model was trained with 10-fold cross-validation. I tried to save all the random states that may affect reproducibility after each fold was trained. The state includes:

    state = {
        'fold_iter_idx': fold_iter_idx,
        'torch_rng_state': torch.get_rng_state(),
        'cuda_rng_state': torch.cuda.get_rng_state(),
        'np_rng_state': np.random.get_state(),
        'random_rng_state': random.getstate()
    }

Then a new model was initialized to train on the next fold. In the end there are 10 results, one for each of the 10 folds.
I can reproduce the same results if I start the program from the very beginning (from the first fold).
But when I tried to reproduce only the 7th fold’s result by loading the state saved after the 6th fold during the previous training, the result was different from the one obtained in that previous training.
I tried disabling cuDNN and setting

torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

I didn’t manage to reproduce it.
However, repeated runs that load the 6th state give the same results as each other,
and on the CPU I can reproduce the results of the from-scratch training by loading the saved states.
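
For reference, a minimal sketch of capturing and restoring the generator states listed above. The function names `capture_rng_state` and `restore_rng_state` are hypothetical, and it uses `torch.cuda.get_rng_state_all()` (every visible GPU) instead of the single-device `torch.cuda.get_rng_state()` from the snippet above. Restoring these states does not make non-deterministic GPU kernels deterministic, so it may not be enough on its own.

```python
import random

import numpy as np
import torch


def capture_rng_state():
    """Snapshot the Python, NumPy and torch generator states."""
    state = {
        "torch_rng_state": torch.get_rng_state(),
        "np_rng_state": np.random.get_state(),
        "random_rng_state": random.getstate(),
    }
    if torch.cuda.is_available():
        # One state tensor per visible GPU.
        state["cuda_rng_state_all"] = torch.cuda.get_rng_state_all()
    return state


def restore_rng_state(state):
    """Put every generator back to the snapshot taken by capture_rng_state()."""
    torch.set_rng_state(state["torch_rng_state"])
    np.random.set_state(state["np_rng_state"])
    random.setstate(state["random_rng_state"])
    if "cuda_rng_state_all" in state and torch.cuda.is_available():
        torch.cuda.set_rng_state_all(state["cuda_rng_state_all"])


if __name__ == "__main__":
    snapshot = capture_rng_state()
    before = torch.rand(3)
    restore_rng_state(snapshot)
    after = torch.rand(3)
    print(torch.equal(before, after))
```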

I am not sure whether I have made my issue clear. :sweat_smile:


You might want to check the reproducibility section of the docs.
In particular, there are a few operations that are inherently non-deterministic, so you won’t be able to get reproducible results if you use them.
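
A minimal sketch of the switches that section describes, assuming PyTorch 1.8 or later (where `torch.use_deterministic_algorithms` is available). With this enabled, operations that have no deterministic implementation raise an error instead of silently producing run-to-run differences, which helps locate them.

```python
import os

# Some cuBLAS operations on CUDA 10.2+ need this to be set before they first run.
os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")

import torch

# Stop cuDNN from auto-tuning, which can pick different kernels on different runs.
torch.backends.cudnn.benchmark = False

# Ask PyTorch to use deterministic implementations, and to raise an error
# when an operation only has a non-deterministic one.
torch.use_deterministic_algorithms(True)

print(torch.are_deterministic_algorithms_enabled())
```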

I’m facing a weird situation: consecutive runs of a script across multiple GPUs reproduce the same results, but running the same script a few days later gives different scores, which are themselves reproducible, at least in the short term.