Reproducibility

Dicko · January 6, 2021, 11:37am

Hi folks, I am struggling to get repeatable results for a DETR model I am running with PyTorch.
I have defined the following function:
import os
import random

import numpy as np
import torch


def seed_torch(seed=42):
    random.seed(seed)  # Python's built-in RNG
    os.environ['PYTHONHASHSEED'] = str(seed)
    np.random.seed(seed)  # NumPy's global RNG
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # if you are using multi-GPU
    torch.backends.cudnn.benchmark = True
    torch.backends.cudnn.deterministic = True
Please note that I have also tried this function with torch.backends.cudnn.benchmark set to False.
This is the folder structure for DETR and the py files contained within it.

I have tried using the seed function in each file separately to see if any one of them would make my results reproducible, and this did not work.
I then tried using the seed function in all files in each folder separately and this did not work.
I tried using the seed function in the first three folders and the first epoch was repeatable but after the first epoch the numbers were different.
The numbers I am talking about look like this:

But I am comparing losses for now.
In the end I copied the seed function to every py file and then used the function in every class and definition in the py files and the results are still not reproducible.
Does anybody have any idea how to resolve this?
I am running my code in the Anaconda terminal.

googlebot · January 6, 2021, 4:40pm

Try enabling torch.set_deterministic, and read its documentation.

It may well fail with runtime errors, as some operations don't have built-in deterministic implementations (on GPU). Sometimes only the ops used in backward() diverge, in which case only the first batch has reproducible outputs. If the whole first epoch is reproducible, the problem may be with the DataLoader's workers.
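A minimal sketch of the two suggestions in this reply, assuming a reasonably recent PyTorch. The helper names enable_determinism, seed_worker and make_loader are hypothetical and are not part of DETR or of the poster's code. Note that torch.set_deterministic was renamed torch.use_deterministic_algorithms in later releases, so the sketch tries the newer name first. Whether either step fixes this particular case is not confirmed by the thread.

```python
import random

import numpy as np
import torch
from torch.utils.data import DataLoader


def enable_determinism():
    """Ask PyTorch to raise an error on ops without a deterministic implementation."""
    if hasattr(torch, "use_deterministic_algorithms"):
        torch.use_deterministic_algorithms(True)
    else:
        # Older releases (around 1.7) expose the same switch under this name.
        torch.set_deterministic(True)
    # Benchmark mode may pick a different cuDNN algorithm from run to run.
    torch.backends.cudnn.benchmark = False
    # On CUDA 10.2+, some cuBLAS ops additionally need the environment
    # variable CUBLAS_WORKSPACE_CONFIG set before the process starts.


def seed_worker(worker_id):
    """Seed NumPy and Python RNGs inside each DataLoader worker process."""
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)


def make_loader(dataset, seed=42, batch_size=4, num_workers=2):
    """Build a shuffling DataLoader whose sample order depends only on `seed`."""
    generator = torch.Generator()
    generator.manual_seed(seed)
    return DataLoader(
        dataset,
        batch_size=batch_size,
        shuffle=True,
        num_workers=num_workers,
        worker_init_fn=seed_worker,
        generator=generator,
    )
```

Calling enable_determinism() once at the start of the training script, and building every DataLoader through something like make_loader, should be enough; repeating the seed call in every file or function does not add anything.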

Dicko · January 7, 2021, 3:48pm

Thank you for your reply, but I have gone through each py file and called the seed function underneath every line of code. I still do not get reproducible results. All files have been seeded, every line seeded, and the results are still not reproducible.

googlebot · January 7, 2021, 4:03pm

Your problem is likely not related to random number generation, so setting more seeds will not help. The more likely culprit is: