Single random operation that differs across DDP processes

I am using DDP and working with stochastic models. I wish to add noise as part of my forward pass. I have each process seeded properly, since I generally want the randomness to be the same across processes. However, for one single operation, I would like each process to produce a different random outcome. I imagine something like this:

seed_everything(0)
a = torch.randn((1024, 10), device=global_rank) # should return the exact same values across all GPUs
b = torch.randn((1024, 10), device=global_rank, decouple_ddp_randomness=True) # should return different values per GPU
c = torch.randn((1024, 10), device=global_rank) # should return the exact same values across all GPUs

I don't want b to be the same across GPUs in DDP, because I am worried it would bias my results.

Is that possible?

What I would do in this case is have every rank generate an int tensor of length world_size (the number of GPUs) — they will all be identical because of the shared seed — and then call dist.scatter() with rank 0 (the main GPU) as the source, so that each rank receives one of those ints. Each rank then initializes its own torch.Generator, seeds it with the int it received (Generator.manual_seed()), and passes that generator to the random operation, provided the operation is one of those that accept a generator argument. If it is not (or you would rather not pass a generator around), then see the Generator docs for how to save the current RNG state, temporarily switch to the per-rank seed for that one operation, and then restore the original state afterwards. A sketch of both variants follows below.
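
Here is a minimal sketch of that approach, assuming the process group is already initialized with a backend that supports CPU tensors such as gloo (with NCCL you would move the seed tensors to the GPU first). world_size and global_rank come from your DDP setup, and seed_list / my_seed / gen are just illustrative names:

import torch
import torch.distributed as dist

world_size = dist.get_world_size()
global_rank = dist.get_rank()

# Every rank draws the same list (the global RNG is seeded identically),
# but dist.scatter() only uses rank 0's copy as the source.
seed_list = [torch.randint(0, 2**31 - 1, (1,), dtype=torch.int64) for _ in range(world_size)]

my_seed = torch.empty(1, dtype=torch.int64)
dist.scatter(my_seed, scatter_list=seed_list if global_rank == 0 else None, src=0)

# Per-rank generator, decoupled from the globally seeded default RNG.
gen = torch.Generator()  # use torch.Generator(device="cuda") for CUDA ops
gen.manual_seed(int(my_seed.item()))

a = torch.randn(1024, 10)                 # identical on every rank
b = torch.randn(1024, 10, generator=gen)  # different on every rank
c = torch.randn(1024, 10)                 # identical on every rank again

If you would rather keep calling torch.randn() without a generator argument, the save/switch/restore of the RNG state can be done with torch.random.fork_rng(), which puts the previous state back when the block exits:

with torch.random.fork_rng():
    torch.manual_seed(int(my_seed.item()))
    b = torch.randn(1024, 10)  # different per rank; the global RNG state is untouched afterwards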