Impact of data shuffling on the reproducibility of results in PyTorch

PyTorch’s DataLoader class has the following constructor:

DataLoader(dataset, batch_size=1, shuffle=False, sampler=None,
           batch_sampler=None, num_workers=0, collate_fn=None,
           pin_memory=False, drop_last=False, timeout=0,
           worker_init_fn=None)

When shuffle is set to True, the data is reshuffled at every epoch. Shuffling the order in which examples are fed to the classifier is helpful because the batches then differ from one epoch to the next, which should eventually make the model more robust.
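To make the reshuffling concrete, here is a minimal sketch. It assumes a recent PyTorch release in which DataLoader also accepts a generator argument (it is not in the signature quoted above). The helper epoch_orders and the ten-element toy dataset are hypothetical names made up for illustration. Seeding that generator should make the shuffle order repeat across runs while still changing from epoch to epoch.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def epoch_orders(seed, num_epochs=2):
    """Return the order in which samples are visited in each epoch."""
    # Toy dataset: the samples are simply the integers 0..9.
    dataset = TensorDataset(torch.arange(10))

    # Assumption: DataLoader accepts `generator` (recent PyTorch versions).
    g = torch.Generator()
    g.manual_seed(seed)
    loader = DataLoader(dataset, batch_size=5, shuffle=True, generator=g)

    orders = []
    for _ in range(num_epochs):
        # Each pass over the loader is one epoch and triggers a reshuffle.
        order = [int(x) for (batch,) in loader for x in batch]
        orders.append(order)
    return orders


run_a = epoch_orders(seed=0)
run_b = epoch_orders(seed=0)
print(run_a)
print(run_a == run_b)
```

Here run_a and run_b should match epoch for epoch. That only pins down the data order; it says nothing yet about the rest of training.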

However, what I can’t work out is this: with shuffle=True, can we still get the same accuracy value across different runs?

@ptrblck
By setting shuffle=True in the data loaders, do we get the same accuracy value across different runs?

You shouldn’t expect bitwise-identical results; for that you would need to set all seeds, use cudnn.deterministic, etc., as described here.

However, if your training procedure is “stable” you should see a small stddev in the final accuracy.

If I use the following code to set all the seeds,

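import os
import random
import numpy as np
import torch

seed = 0  # example value; any fixed integer works
random.seed(seed)
# Only affects subprocesses; the current interpreter's hash seed is fixed at startup.
os.environ['PYTHONHASHSEED'] = str(seed)
np.random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
torch.cuda.manual_seed_all(seed)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.enabled = False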

will shuffling the instances still result in different values across runs? @ptrblck

@ptrblck

I’m unable to understand
a) what a stable training procedure is, and b) how to achieve it.

If setting all seeds doesn’t help, you would have to check whether your model uses one of the methods mentioned in the linked doc.
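One way to carry out that check, sketched under assumptions: recent PyTorch versions provide torch.use_deterministic_algorithms, which makes operations known to lack a deterministic implementation raise a RuntimeError instead of running silently. The helper name runs_deterministically and the tiny linear model are hypothetical stand-ins for your own model and input.

```python
import torch
from torch import nn


def runs_deterministically(model, example_input):
    """Run one forward/backward pass with deterministic mode switched on.

    Returns True if the pass completed, False if it raised a RuntimeError,
    which deterministic mode does for ops without a deterministic
    implementation.
    """
    previous = torch.are_deterministic_algorithms_enabled()
    torch.use_deterministic_algorithms(True)
    try:
        model(example_input).sum().backward()
        return True
    except RuntimeError as err:
        print(f"Possible nondeterministic op: {err}")
        return False
    finally:
        # Restore whatever setting was active before the check.
        torch.use_deterministic_algorithms(previous)


# Hypothetical stand-in for a real model and batch.
# Note: on CUDA, some ops may additionally need the CUBLAS_WORKSPACE_CONFIG
# environment variable to be set.
ok = runs_deterministically(nn.Linear(4, 2), torch.randn(8, 4))
print(ok)
```

If the helper reports a failure, the error message should name the offending operation, which you can then look up in the linked doc.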

By “stable” I meant that your model would converge in 10 out of 10 runs, with a small delta in the final loss and accuracy across different seeds.
With “unstable” training, the outcome would depend on the seed, as if it were a hyperparameter.
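A rough way to measure this is to repeat training with several seeds and look at the spread of the final accuracy. The sketch below is only an illustration: the synthetic two-class data, the single linear layer, and the names make_data and train_and_evaluate are all assumptions, and accuracy is computed on the training data for brevity.

```python
import statistics

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def make_data():
    # Fixed synthetic two-class problem (stand-in for a real dataset).
    g = torch.Generator().manual_seed(1234)
    x = torch.randn(200, 2, generator=g)
    y = (x[:, 0] + x[:, 1] > 0).float().unsqueeze(1)
    return x, y


def train_and_evaluate(seed):
    # The seed controls both weight initialization and shuffle order.
    torch.manual_seed(seed)
    x, y = make_data()
    loader = DataLoader(TensorDataset(x, y), batch_size=20, shuffle=True)
    model = nn.Linear(2, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.5)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(20):  # epochs
        for xb, yb in loader:
            optimizer.zero_grad()
            loss_fn(model(xb), yb).backward()
            optimizer.step()
    with torch.no_grad():
        preds = (model(x) > 0).float()
    return (preds == y).float().mean().item()


accuracies = [train_and_evaluate(seed) for seed in range(10)]
mean_acc = statistics.mean(accuracies)
std_acc = statistics.stdev(accuracies)
print(f"mean={mean_acc:.3f} stddev={std_acc:.3f}")
```

A small standard deviation across the ten seeds would suggest the procedure is stable in the sense described above; a large one would suggest the seed is acting like a hyperparameter.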
