How to get deterministic behavior?

I am using:

cudnn.benchmark = False
cudnn.deterministic = True

random.seed(1)
numpy.random.seed(1)
torch.manual_seed(1)
torch.cuda.manual_seed(1)

And still not getting deterministic behavior…

How large are the differences?
Could you provide a code snippet showing the error?

If the absolute error is approx. 1e-6 it might be due to the usage of float values.
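To illustrate why tiny differences can be expected with floating-point values: addition is not associative, so a different summation order (which can happen in parallel reductions) may change the last bits of a result. A small pure-Python sketch, using double-precision floats rather than the float32 tensors a model would normally use:

```python
# Floating-point addition is not associative: the grouping changes the rounding.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

print(left == right)      # the two groupings do not compare equal
print(abs(left - right))  # the gap is tiny, far below 1e-6
```

Differences of this size are harmless noise; anything around 1% points to a different source of randomness.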

The differences are meaningful, not on the order of 1e-6…

The differences are about 1%, more or less…

The problem is in the data augmentation transformations:

transforms.RandomCrop(32, padding=4),
transforms.RandomHorizontalFlip(),

After I removed the above code, it worked 100% deterministically…

I am using PyTorch 0.3.1

Does torchvision have a different seed from torch?

The following appears to be the same issue:

If I use num_workers=0, I can get back the augmentation without losing the deterministic behavior, exactly as reported in the link above.

This has been fixed in 0.4, provided you set random.seed in worker_init_fn.

Furthermore, you might want to set torch.backends.cudnn.deterministic=True
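For the worker_init_fn part, a minimal sketch of what such a function might look like. The names seed_worker and BASE_SEED are hypothetical, and offsetting the seed by the worker id is just one possible choice; using the same seed in every worker is also deterministic, but then every worker draws the same random stream.

```python
import random

BASE_SEED = 1  # hypothetical base seed, matching the seed used in this thread


def seed_worker(worker_id):
    """Seed the Python (and, if installed, NumPy) RNG inside one DataLoader worker."""
    # Offset by worker_id so workers do not all produce the same random stream.
    seed = (BASE_SEED + worker_id) % 2**32
    random.seed(seed)
    try:
        import numpy
        numpy.random.seed(seed)
    except ImportError:
        pass  # NumPy is optional here


# Intended use (not executed here; num_workers=4 is an arbitrary example):
#   DataLoader(dataset, num_workers=4, worker_init_fn=seed_worker)
```

Note that the function itself is passed, without calling it; the DataLoader calls it once in each worker process with that worker's id.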

It worked. Thanks. Nevertheless, I think we should have a more straightforward way to get deterministic behavior. This “workaround” in the workers is not so intuitive for beginners.

Based on my tests, even in PyTorch 0.4 we still need to initialize the workers with the same seed to get deterministic behavior. The following lines are NOT enough:

cudnn.benchmark = False
cudnn.deterministic = True

random.seed(1)
numpy.random.seed(1)
torch.manual_seed(1)
torch.cuda.manual_seed(1)

I think this should not be the standard behavior. In my opinion, the above lines should be enough to provide deterministic behavior. It is not obvious to a novice that, besides the above lines, they also need to initialize the workers with the same seed to get deterministic behavior.

The problem is numpy. We can’t assume that numpy exists so you’d have to set the seed for numpy in workers yourself.

IMO, worker_init_fn allows some flexibility, but why shouldn’t PyTorch’s workers have a reasonable default behavior? Something like the following code block, if it were to execute before worker_init_fn, would be backward compatible and would provide determinism out of the box, whether Numpy is installed or not.

try:
    import numpy
    torch_seed = torch.initial_seed()
    # NumPy expects a seed in the range [0, 2**32 - 1].
    np_seed = torch_seed % 2**32
    numpy.random.seed(np_seed)
except ImportError:
    # NumPy is not installed, so there is nothing to seed.
    pass

So does torch.backends.cudnn.deterministic=True make any form of random data pre-processing using torch libraries deterministic?

Beware of running 2 processes on the same machine, even if you set random seeds and set num_workers=0. In my experience, running one process on its own is deterministic, but running 2 processes side by side is not. The consensus here is that static variables in dynamically linked libraries are to blame. If you need to compare, run on two different machines.
(by 2 processes I mean two training scripts for example)

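import random
import numpy as np
import torch

def set_seed(seed):
    # Make cuDNN pick deterministic algorithms and skip autotuning.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    # Seed the CPU and all CUDA generators, then NumPy and Python's random.
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    np.random.seed(seed)
    random.seed(seed)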

The final line solved this issue for me.

I’ve read this thread closely and tried everything I can see on here as suggestions, and I still cannot get deterministic behavior during training. I’m using PyTorch 1.1.0 and Torchvision 0.3.0.

In my code I have the following:

torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
torch.manual_seed(1)
torch.cuda.manual_seed_all(1)
np.random.seed(1)
random.seed(1)

And my dataloader looks like this:

train_loader = torch.utils.data.DataLoader(
    train_dataset,
    batch_size=128,
    num_workers=0,
    shuffle=True,
    pin_memory=True,
    worker_init_fn=random.seed(1)
)

Is there a place I’ve missed setting the seed? Looking at the first few mini-batches of training, the differences in accuracy across starts are pretty stark.

Also, I don’t even get deterministic behavior when I disable RandomResizedCrop and RandomHorizontalFlip in my image transforms – so the non-deterministic behavior is happening somewhere else.

Any clues or pointers would be appreciated.
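One thing that may be worth checking in the loader above: worker_init_fn=random.seed(1) calls random.seed(1) immediately and passes its return value, None, to the DataLoader, so no init function is actually registered. With num_workers=0 there are no worker processes to initialize in any case, so this is probably not the source of the remaining differences. A small sketch of the distinction (seed_worker is a hypothetical name):

```python
import random

# Calling the function evaluates it right away; the result is None.
passed_value = random.seed(1)
print(passed_value)  # None


def seed_worker(worker_id):
    # A callable like this is what worker_init_fn expects;
    # it receives the worker id as its only argument.
    random.seed(1)


print(callable(passed_value), callable(seed_worker))  # False True
```

Passing worker_init_fn=seed_worker (without parentheses) hands over the function itself, which only matters once num_workers is greater than 0.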