Does PyTorch change its internal seed during training?

I am trying to make my training code as deterministic and reproducible as possible. When running the same training code multiple times, always re-initialising the model, I get different results, even if I set the seeds manually before the runs start. I found that when I reset the seed on every training run, all runs do end up with the same result. This seems to indicate that torch (or numpy or Python’s random) internally changes its seed.

import os
import random

import numpy as np
import torch

def set_seed():
    torch.manual_seed(3)
    torch.cuda.manual_seed_all(3)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    np.random.seed(3)
    random.seed(3)
    os.environ['PYTHONHASHSEED'] = str(3)

for i in range(10):
    set_seed()
    model = init_model()  # init_model(), train() and test() are my own helpers
    model.train()
    model.test()

The code above always produces the same result for the test set, as expected. But when I only set the seed once, i.e. outside the loop, results vary over the iterations. What causes this?


uh… random generation is a sequential process, so every time you generate a new random number the generator’s internal state (the “seed”) changes. Thus, before every training run begins you need to reset the seed, i.e. put set_seed() inside the loop.
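
You can watch the state advancing directly, e.g. with a quick sketch comparing the default generator’s state before and after a draw:

torch.manual_seed(3)
state_before = torch.get_rng_state()           # snapshot of the default CPU generator
torch.randn(2)                                 # drawing numbers advances the generator
state_after = torch.get_rng_state()
print(torch.equal(state_before, state_after))  # False: the state has moved on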


I’m not sure we understand each other. Let’s say that I put set_seed(5) outside the range loop; I would expect all ten runs to have the same result. This is not the case. When I put set_seed(5) inside the range loop, it does work as expected. Does that mean the seed changes over time?

I did find that torch.manual_seed returns a Generator object, which does explain this behaviour.
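
A quick check of this:

g = torch.manual_seed(5)
print(type(g))           # a torch.Generator (the default generator)
print(g.initial_seed())  # 5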

As @ybj14 said, the pseudo-random number generator uses the seed as its initial seed and generates all sequential numbers based on this initial seed.
That doesn’t mean that every “random” number will have the exact same value (which would create a useless random number generator), but that the sequence of random numbers is the same.
Have a look at this example:

torch.manual_seed(2809)
print(torch.randn(2))
print(torch.randn(2))
print(torch.randn(2))

torch.manual_seed(2809)
print(torch.randn(2))
print(torch.randn(2))
print(torch.randn(2))

As you can see, torch.randn yields new random numbers when called sequentially. After resetting the seed, you’ll get the same sequence of random numbers again.

In your case, the model initialization can be seen as a call to the random number generator, which will yield different results. If you want your model to have exactly the same parameters, set the seed before initialization or reload the state_dict of a reference model.
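
For example, a minimal sketch reusing init_model from the question above:

# option 1: seed right before creating the model
torch.manual_seed(3)
model = init_model()

# option 2: keep one reference model and reload its parameters for every run
reference = init_model()
model = init_model()
model.load_state_dict(reference.state_dict())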


Thank you, that does make sense indeed!

Hello everybody!
I did as above, but I still have a problem.

Code:
def seed_torch(seed=0):
    random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # if you are using multi-GPU
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True

seed_torch()
net_basnet.train()
for i, data in enumerate(train_dataloader):
    seed_torch()

seed_torch()
net_basnet.eval()
seed_torch()
with torch.no_grad():
    for i, data in enumerate(val_dataloader):
        seed_torch()

I get the same training loss across runs, but the validation loss differs, even when putting seed_torch() everywhere in the loops. Please help me solve this problem. Thank you!

Hi @ptrblck ,

“the pseudo-random number generator uses the seed as its initial seed and generates all sequential numbers based on this initial seed.”

Is it possible to print these sequential numbers in Python somehow?

You can print these values directly, e.g. via:

torch.manual_seed(2809)
for _ in range(10):
    print(torch.randn(1))

After re-seeding you would see the same values again.

Actually, my code is pretty big and I can’t run a for loop like this. I just wanted to print the value of RNG at some particular point in my code.

I’m not sure I understand what “value of RNG” is.
The pseudorandom number generator can be seeded and will then output defined pseudorandom values.
If you want to check which value would be sampled in e.g. the 1000th call to torch.randn, you could just re-seed and call this method that many times.
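
Alternatively, a minimal sketch for peeking at the next value at a particular point in your code without disturbing the run (this uses the default CPU generator; torch.cuda.get_rng_state covers the GPU side):

state = torch.get_rng_state()  # remember where the default CPU generator currently is
peek = torch.randn(1)          # the value the next call would have produced
print(peek)
torch.set_rng_state(state)     # restore the state so the rest of the run is unaffected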
