How to get deterministic behavior?

The problem is in the data augmentation transformations:

transforms.RandomCrop(32, padding=4),
transforms.RandomHorizontalFlip(),

After I removed the above code, it worked 100% deterministically…

I am using PyTorch 0.3.1.

Does torchvision have a different seed from torch?

1 Like

The following appears to be the same issue:

If I use num_workers=0, I can get back the augmentation without losing the deterministic behavior, exactly as reported in the link above.

This has been fixed in 0.4, provided you set random.seed in worker_init_fn.

Furthermore, you might want to set torch.backends.cudnn.deterministic=True
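
A minimal sketch of that worker_init_fn approach (the dataset name and loader arguments are illustrative, not from the original posts):

import random
import torch
from torch.utils.data import DataLoader

def seed_worker(worker_id):
    # Each worker already gets a distinct torch seed; derive the Python RNG
    # seed from it so augmentations stay reproducible with num_workers > 0.
    random.seed(torch.initial_seed() % 2**32)

train_loader = DataLoader(train_dataset, batch_size=128, shuffle=True,
                          num_workers=4, worker_init_fn=seed_worker)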

3 Likes

It worked. Thanks. Nevertheless, I think we should have a more straightforward way to get deterministic behavior. This “workaround” in the workers is not so intuitive for beginners.

2 Likes

Based on my tests, even in PyTorch 0.4 we still need to initialize the workers with the same seed to get deterministic behavior. The following lines are NOT enough:

cudnn.benchmark = False
cudnn.deterministic = True

random.seed(1)
numpy.random.seed(1)
torch.manual_seed(1)
torch.cuda.manual_seed(1)

I think this should not be the standard behavior. In my opinion, the above lines should be enough to provide deterministic behavior. It is not obvious to a novice that, besides the above lines, they also need to initialize the workers with the same seed to get deterministic behavior.

1 Like

The problem is NumPy. We can’t assume that NumPy is installed, so you’d have to set the NumPy seed in the workers yourself.

IMO, worker_init_fn allows some flexibility, but why shouldn’t PyTorch’s workers have a reasonable default behavior? Something like the following code block, if it were to execute before worker_init_fn, would be backward compatible and would provide determinism out of the box, whether Numpy is installed or not.

try:
    import numpy
    torch_seed = torch.initial_seed()
    # NumPy expects an unsigned 32-bit integer seed, so fold the torch seed
    # into that range.
    numpy.random.seed(torch_seed % 2**32)
except ImportError:
    pass
1 Like

So does torch.backends.cudnn.deterministic=True make any form of random data pre-processing that uses torch libraries deterministic?

Beware of running two processes on the same machine, even if you set random seeds and set num_workers=0. In my experience, running one process on its own is deterministic, but running two processes side by side is not. The consensus here is that static variables in dynamically linked libraries are to blame. If you need to compare, run on two different machines.
(By two processes I mean, for example, two training scripts.)

1 Like

def set_seed(seed):
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    np.random.seed(seed)
    random.seed(seed)

The final line solved this issue for me.

2 Likes

I’ve read this thread closely and tried every suggestion I’ve seen here, and I still cannot get deterministic behavior during training. I’m using PyTorch 1.1.0 and torchvision 0.3.0.

In my code I have the following:

torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
torch.manual_seed(1)
torch.cuda.manual_seed_all(1)
np.random.seed(1)
random.seed(1)

And my dataloader looks like this:

train_loader = torch.utils.data.DataLoader(
    train_dataset,
    batch_size=128,
    num_workers=0,
    shuffle=True,
    pin_memory=True,
    worker_init_fn=random.seed(1)
)

Is there a place I’ve missed setting the seed? Looking at the first few mini-batches of training, the differences in accuracy across starts are pretty stark.

Also, I don’t even get deterministic behavior when I disable RandomResizedCrop and RandomHorizontalFlip in my image transforms – so the non-deterministic behavior is happening somewhere else.

Any clues or pointers would be appreciated.

Based on previous answers and the official docs, here is what works for me.

def set_seed(seed):
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    np.random.seed(seed)
    random.seed(seed)

set_seed(0) # 0 or any number you want for the seed

In the DataLoader(), I also set num_workers=0.
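
For reference, a minimal sketch of that loader setup (the dataset name and batch size are just placeholders); with num_workers=0, loading and augmentation run in the main process, so the seeds set above cover them:

train_loader = torch.utils.data.DataLoader(
    train_dataset,
    batch_size=128,
    shuffle=True,
    num_workers=0,
)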

This setting makes my results reproducible.
I am using the latest PyTorch version, 1.5.0.

2 Likes

Will this work for PyTorch version 1.0.1 as well?

Not sure about 1.0.1. I suppose if the same options are available, it may work as well. You can try.

Well, I added torch.backends.cudnn.enabled = False and it worked for me. I may have to recheck the results with torch.backends.cudnn.enabled = True.

Hi @dstanner, did you solve it? This doesn’t work for me either.

I’m not sure why… but this does not work for me…

I never got this to work, no. I’ve been doing this on Windows, though; not sure if that makes a difference. I’ve found that some PyTorch functionality isn’t well optimized for Windows.

Hi, I have used exactly the same code, but my code is still non-deterministic. I am not able to reproduce the results. Tried with both PyTorch 1.4.0 and 1.3.1.

Thanks.

Hello, in my case it was caused by using joblib’s Parallel to cache the dataset. Replacing Parallel with a for loop helped.
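
Roughly what that change looked like (cache_item and dataset are placeholder names, not the actual code); the joblib workers carry their own RNG state, while a plain loop stays in the already-seeded main process:

# Before (non-deterministic in my case):
# from joblib import Parallel, delayed
# cached = Parallel(n_jobs=8)(delayed(cache_item)(i) for i in range(len(dataset)))

# After:
cached = [cache_item(i) for i in range(len(dataset))]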