How to get deterministic behavior with DistributedDataParallel?

Hello, my code behaves deterministically without DistributedDataParallel, but not when I wrap the model in DistributedDataParallel.

My code for deterministic behavior is:

cudnn.benchmark = False
cudnn.deterministic = True
torch.cuda.manual_seed_all(123)
…, worker_init_fn=random.seed)

And my launch command:
python -m torch.distributed.launch --master_port=$((RANDOM + 10000))

Does DistributedDataParallel need any additional settings to behave deterministically?

DistributedDataParallel itself should be deterministic. All it does is apply allreduce to synchronize gradients across processes. Can you check whether the data loader is providing deterministic inputs on each process?
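
One way to run that check is to hash the first few batches on each process and compare the digests between two runs. The following is only a minimal sketch: `fingerprint_batches` is a hypothetical helper (not part of PyTorch), and it assumes each batch has already been converted to raw bytes, for example with `images.cpu().numpy().tobytes()`.

```python
import hashlib
from itertools import islice
from typing import Iterable, List


def fingerprint_batches(batches: Iterable[bytes], limit: int = 5) -> List[str]:
    """Return a short SHA-256 digest for each of the first `limit` batches."""
    return [hashlib.sha256(raw).hexdigest()[:16] for raw in islice(batches, limit)]


# Hypothetical usage inside the training script (variable names are assumptions):
#   raw = (images.cpu().numpy().tobytes() for images, _ in train_loader)
#   print(rank, fingerprint_batches(raw))

if __name__ == "__main__":
    # Stand-in data: two "runs" that yield identical bytes give identical digests.
    run_a = [b"batch-0", b"batch-1", b"batch-2"]
    run_b = [b"batch-0", b"batch-1", b"batch-2"]
    print(fingerprint_batches(run_a) == fingerprint_batches(run_b))
```

If the digests for the same rank differ between two runs, the input pipeline (shuffling, augmentation, worker seeding) is the more likely source of the nondeterminism, not the gradient allreduce. Note also that worker_init_fn is called with the worker id, so worker_init_fn=random.seed seeds only Python's random module with that id; NumPy-based augmentations would need their own seeding.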