Issues with reproducibility

Hi folks, I’m trying to understand some aspects of reproducibility and it seems that I might not have understood something properly.

I have a toy multilayer perceptron and every time I train and test it I get different results regarding test accuracy.

I’ve seeded all possible sources of randomness I could think of:

import os
import random

import numpy as np
import torch

seed = 2021
os.environ['PYTHONHASHSEED'] = '2021'
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)
torch.backends.cudnn.benchmark = False  # note: "benchmark", not "benchmarks"
torch.backends.cudnn.deterministic = True
torch.cuda.manual_seed(seed)
torch.cuda.manual_seed_all(seed)
torch.set_deterministic(True)  # torch.use_deterministic_algorithms(True) in newer releases

But I still observe this discrepancy in accuracy every time I re-train and test the model.

I suspect this has to do with the fact that every time the model is constructed the weights are randomly initialised, hence the different results in accuracy.

It seems like none of the above seeds fixes the weights of the network for reproducibility. Am I understanding this wrong, or did I miss something crucial regarding reproducibility? (And yes, I’ve read the docs about reproducibility.)

Seeding the code should use the same “random” values to initialize the model.
If you think the model parameters are still different, you could save the state_dict of two different runs and compare the values.

In case the parameters are indeed equal, you could then add debug print statements and try to narrow down which iteration produces different results.
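
For the comparison step, a minimal sketch could look like this (the MLP class and the file names are just placeholders for your actual setup):

import torch
import torch.nn as nn

# placeholder for your actual model definition
class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))

    def forward(self, x):
        return self.layers(x)

# in each run, save the freshly initialised parameters
torch.manual_seed(2021)
model = MLP()
torch.save(model.state_dict(), 'run1_init.pth')  # use 'run2_init.pth' in the second run

# afterwards, load both files and compare them parameter by parameter
sd1 = torch.load('run1_init.pth')
sd2 = torch.load('run2_init.pth')
for name in sd1:
    if not torch.equal(sd1[name], sd2[name]):
        print(f'mismatch in {name}')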

Thanks @ptrblck,
from a simple experiment with numpy and seeds, it looks like, no matter whether you set the seed or not, consecutive calls to numpy.random.randn will yield different results.

In that case, why do we expect that creating multiple instances of the model (e.g. modelA = MLP(), modelB = MLP()) will give the same weights? I would expect modelA and modelB to have different weights due to random initialisation, no?

We don’t, and it’s expected that sequential calls to the pseudo-random number generator will yield different results. Otherwise you would be sampling constant values, which is wrong.

Seeding the code makes sure that the sequence of random numbers is equal between runs.

np.random.seed(2809)
print(np.random.randn(10))
> [ 0.23747478  0.56083014 -1.1650379   0.12482746  0.04529476 -0.60137682
 -0.18559365  0.57410855 -0.53776209 -0.03972935]

print(np.random.randn(10))
> [-0.85086083  0.25095493  0.30150578  1.15011173  0.36414967  0.3197194
 -1.60004496 -0.74648366  0.65739336  0.15710605]

np.random.seed(2809)
print(np.random.randn(10))
> [ 0.23747478  0.56083014 -1.1650379   0.12482746  0.04529476 -0.60137682
 -0.18559365  0.57410855 -0.53776209 -0.03972935]

print(np.random.randn(10))
> [-0.85086083  0.25095493  0.30150578  1.15011173  0.36414967  0.3197194
 -1.60004496 -0.74648366  0.65739336  0.15710605]

The Wikipedia article on pseudorandom number generators explains it in more detail.

Thanks for the example, but I’m not sure how you achieved random numbers that are equal between runs. For instance, in the example below they all seem different to me, no?

>>> import os
>>> import random
>>> import numpy as np
>>> os.environ['PYTHONHASHSEED'] = '2021'
>>> random.seed(2021)
>>> np.random.seed(2021)
>>> np.random.randn(3,3)
array([[ 1.48860905,  0.67601087, -0.41845137],
       [-0.80652081,  0.55587583, -0.70550429],
       [ 1.13085826,  0.64500184,  0.10641374]])
>>> np.random.randn(3,3)
array([[ 0.42215483,  0.12420684, -0.83795346],
       [ 0.4090157 ,  0.10275122, -1.90772239],
       [ 1.1002243 , -1.40232506, -0.22508127]])
>>> np.random.randn(3,3)
array([[-1.33620597,  0.30372151, -0.72015884],
       [ 2.5449146 ,  1.31729112,  0.0726303 ],
       [-0.25610814,  0.13801041,  1.14723599]])
>>> np.random.randn(3,3)
array([[ 1.37626076, -0.47218397,  0.5240849 ],
       [ 1.48510793,  1.48243225,  0.72813051],
       [-0.38964808,  0.27889376,  0.0519002 ]])
>>> np.random.randn(3,3)
array([[-1.04474368, -0.16150753, -2.79353057],
       [ 0.36164801,  0.24010758,  0.47781228],
       [ 0.20885194,  0.91791163, -1.41177784]])
>>> np.random.randn(3,3)
array([[ 1.22423572, -0.54565772,  0.90674805],
       [-0.98261724, -0.63316189,  0.82552024],
       [-1.28449686, -0.34730878, -1.07776075]])
>>> np.random.randn(3,3)
array([[ 0.97257495,  0.41219116, -0.13208604],
       [-0.67335092,  1.22222171, -0.92641342],
       [ 1.4249938 , -0.12478993,  0.70549192]])
>>> np.random.randn(3,3)
array([[ 0.7195475 ,  0.14146394, -0.65444848],
       [-0.67210417,  0.87036849,  1.00324527],
       [-0.36644983,  1.12805168,  0.7925838 ]])
>>> np.random.randn(3,3)
array([[-1.75084239, -0.80886945, -1.41260181],
       [-0.8030097 ,  0.79828047,  1.75599622],
       [-0.0755042 ,  0.42017682,  0.15655606]])

Terminal 1:

>>> import numpy as np
>>> np.random.seed(2809)
>>> 
>>> print(np.random.randn(10))
[ 0.23747478  0.56083014 -1.1650379   0.12482746  0.04529476 -0.60137682
 -0.18559365  0.57410855 -0.53776209 -0.03972935]
>>>

Terminal 2:

>>> import numpy as np
>>> np.random.seed(2809)
>>> 
>>> print(np.random.randn(10))
[ 0.23747478  0.56083014 -1.1650379   0.12482746  0.04529476 -0.60137682
 -0.18559365  0.57410855 -0.53776209 -0.03972935]
>>>

In @ptrblck’s example, the np.random.seed method is called again, which “resets” the generator so that the same numbers are generated. Once the seed is set, every consecutive call will still produce different numbers, but the same sequence of numbers will be generated if you initialize the generator from the same seed in different instances.
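
Applied to the earlier modelA / modelB question, a rough sketch (using a toy model as a stand-in for the actual MLP) of what this means:

import torch
import torch.nn as nn

def make_mlp():
    # toy stand-in for the actual MLP
    return nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))

torch.manual_seed(2021)
modelA = make_mlp()
modelB = make_mlp()  # consumes the next values in the sequence -> different weights
print(torch.equal(modelA[0].weight, modelB[0].weight))  # False

torch.manual_seed(2021)  # "reset" the generator
modelC = make_mlp()      # same weights as modelA
print(torch.equal(modelA[0].weight, modelC[0].weight))  # True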


I see, thanks for the clarification. I missed the 2nd call to np.random.seed() in @ptrblck’s example.

I suppose that works fine for scripts running in a terminal, but maybe it’s problematic with interactive notebooks. I guess calling the seed() function from cell to cell would be quite annoying?

Why would you want to do that? Resetting the seed would break the assumption that you are “randomly” initializing the model, data etc.
If you want to reuse the same parameters, I would rather load the state_dict of another model instead of relying on the seed.
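
A minimal sketch of that approach (the toy model and file name here are just placeholders):

import torch
import torch.nn as nn

# toy stand-in for your model
model_a = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))

# save the parameters once ...
torch.save(model_a.state_dict(), 'mlp_params.pth')

# ... and load them into a new instance instead of relying on the seed
model_b = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
model_b.load_state_dict(torch.load('mlp_params.pth'))

# both models now hold identical parameters
for p_a, p_b in zip(model_a.parameters(), model_b.parameters()):
    assert torch.equal(p_a, p_b)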