Setting seeds does not give reproducible code

shrbrh · February 17, 2023, 2:28am

Hi, I am trying to reproduce the results of my PyTorch code, but it gives different values every time I run it. I have set the seeds as follows:

seed = 42
numpy.random.seed(seed)
random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
tf.random.set_seed(seed)
tf.experimental.numpy.random.seed(seed)
tf.compat.v1.set_random_seed(seed)

# When running on the CuDNN backend, two further options must be set
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = True

#For tensorflow
os.environ['TF_CUDNN_DETERMINISTIC'] = '1'
os.environ['TF_DETERMINISTIC_OPS'] = '1'
	
# Set a fixed value for the hash seed
os.environ["PYTHONHASHSEED"] = str(seed)

Did I miss anything?

eqy · February 17, 2023, 2:43am

Have you narrowed it down to a particular operation or set of operations that exhibit non-determinism? Note that the docs also recommend torch.backends.cudnn.benchmark = False for determinism; see the Reproducibility page of the PyTorch 1.13 documentation.
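For reference, a minimal sketch of a PyTorch-only seeding helper along the lines of that docs page. The name seed_everything is made up for illustration, and the snippet assumes NumPy and PyTorch are installed; it is not a guarantee of bit-identical runs.

```python
import random

import numpy as np
import torch


def seed_everything(seed: int = 42) -> None:
    """Seed the Python, NumPy and PyTorch RNGs and request deterministic cuDNN."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # silently ignored when CUDA is unavailable
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False  # benchmark mode may pick different kernels per run


seed_everything(42)
```

The same docs page also describes torch.use_deterministic_algorithms(True), which raises an error when an operation without a deterministic implementation is hit. That can help locate the offending operation.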

shrbrh · February 17, 2023, 6:20am

@eqy I tried setting torch.backends.cudnn.benchmark = False but it’s still not working.
I am actually training a GAN, and the loss values and discriminator probability outputs for real/fake are different every time I run the code.

ptrblck · February 17, 2023, 7:34am

Could you try to narrow down the part of the code which creates the non-deterministic values and post the code snippet here, please?
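One way to narrow it down, as a rough sketch: run a single piece of the pipeline twice from the same seed and compare the outputs exactly. The helper name is_repeatable and the small Linear layer are stand-ins for illustration, not part of the original code.

```python
import torch


def is_repeatable(fn, seed: int = 42) -> bool:
    """Run fn twice from the same torch seed and report whether the outputs match exactly."""
    torch.manual_seed(seed)
    first = fn()
    torch.manual_seed(seed)
    second = fn()
    return torch.equal(first, second)


# Stand-in computation; substitute one stage of the real model or data pipeline.
layer = torch.nn.Linear(8, 4)
x = torch.ones(2, 8)
print(is_repeatable(lambda: torch.nn.functional.dropout(layer(x), p=0.5)))
```

Applying this to the data loading, the generator and the discriminator separately should show which stage first diverges.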

shrbrh · February 17, 2023, 12:01pm

@ptrblck Actually I am getting non-deterministic values for multiple outputs. Upon searching for solutions, I found a post here that says setting num_workers = 0 for the DataLoader works. I tried that, but it still doesn’t work for me. After making a few changes, though, I think it has something to do with the DataLoader part of my code. Here is the snippet:

content_iter = iter(data.DataLoader(
    content_dataset, batch_size=4,
    sampler=InfiniteSamplerWrapper(content_dataset),
    num_workers=16))
style_iter = iter(data.DataLoader(
    style_dataset, batch_size=number_of_styles,
    sampler=InfiniteSamplerWrapper(style_dataset),
    num_workers=16))

Here, number_of_styles is 19.
And the InfiniteSamplerWrapper class is defined in a different file as follows:

import numpy as np
from torch.utils import data

def InfiniteSampler(n):
    # i = 0
    i = n - 1
    order = np.random.permutation(n)
    while True:
        yield order[i]
        i += 1
        if i >= n:
            np.random.seed(1)
            order = np.random.permutation(n)
            i = 0


class InfiniteSamplerWrapper(data.sampler.Sampler):
    def __init__(self, data_source):
        self.num_samples = len(data_source)
        #print("Num samples:",self.num_samples)

    def __iter__(self):
        return iter(InfiniteSampler(self.num_samples))

    def __len__(self):
        return 2 ** 31

If I call np.random.seed() without passing a number in the parentheses, I get different values from the very beginning. If I use np.random.seed(1) instead, I get identical values for the first iteration, but they start to change from the second iteration.
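A possible factor here is that the sampler draws from, and reseeds, NumPy's global RNG, which both loaders and any other NumPy code share. A minimal sketch of a sampler with its own private generator follows; the function name infinite_sampler and the seed argument are assumptions for illustration, and this is not confirmed as the fix.

```python
from itertools import islice

import numpy as np


def infinite_sampler(n, seed=0):
    """Yield dataset indices forever, reshuffling after every full pass over n items."""
    rng = np.random.default_rng(seed)  # private RNG, unaffected by np.random.seed
    while True:
        for idx in rng.permutation(n):
            yield int(idx)


# Two samplers built from the same seed produce the same index stream.
first = list(islice(infinite_sampler(5, seed=0), 10))
again = list(islice(infinite_sampler(5, seed=0), 10))
print(first == again)
```

In the wrapper class, __iter__ could return infinite_sampler(self.num_samples, seed), with the seed passed in through a hypothetical extra constructor argument, so the content and style loaders no longer interfere with each other.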

ptrblck · February 17, 2023, 5:38pm

Your code is unfortunately not executable, so I won’t be able to debug it. If you are using multiple workers, you would have to use the worker_init_fn to seed third-party libs such as numpy.
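A minimal sketch of such a worker_init_fn, following the pattern shown in the PyTorch reproducibility notes; it assumes the main process has already called torch.manual_seed.

```python
import random

import numpy as np
import torch


def seed_worker(worker_id):
    """Seed NumPy and Python's random module inside a DataLoader worker process."""
    # Inside a worker, torch.initial_seed() is derived from the loader's base seed
    # plus the worker id, so each worker gets a distinct but repeatable seed.
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)
```

It would be passed to each loader as worker_init_fn=seed_worker, alongside the existing num_workers argument.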

nwn · February 27, 2023, 2:56pm

Just a wild guess, so no guarantee that it will work: have you tried setting the DataLoader seed?
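If "dataloader seed" means the generator argument of DataLoader, a minimal sketch looks like the following. The toy TensorDataset and the helper name make_loader are stand-ins for illustration only.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def make_loader(dataset, seed, batch_size=4, num_workers=0):
    """Build a shuffling DataLoader whose randomness comes from its own seeded generator."""
    g = torch.Generator()
    g.manual_seed(seed)
    return DataLoader(dataset, batch_size=batch_size, shuffle=True,
                      num_workers=num_workers, generator=g)


dataset = TensorDataset(torch.arange(20))  # toy stand-in for the real dataset
run_a = [batch[0].tolist() for batch in make_loader(dataset, seed=0)]
run_b = [batch[0].tolist() for batch in make_loader(dataset, seed=0)]
print(run_a == run_b)
```

Note that the loaders in this thread use a custom sampler, so as far as I understand the generator would not control the index order there; it would only determine the base seed handed to the worker processes.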