Can I use torch.cuda.manual_seed() with multiple GPUs to keep training deterministic?

Hi experts, I am training a vision model with multiple GPUs (8 GPUs). I launch the job with "mpirun --npernode 8", which gives me 8 processes for 8 GPUs (one process per GPU). I currently initialize the seeds with this code:

    if torch.cuda.is_available():
        self.seed = int(self.seed)
        random.seed(self.seed)
        np.random.seed(self.seed)
        torch.manual_seed(self.seed)

        # ddp: only seed the GPU associated with this process
        torch.cuda.manual_seed(self.seed)

        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False

  1. To keep training deterministic, must I use torch.cuda.manual_seed_all()? It seems that torch.cuda.manual_seed() with the same seed in all 8 processes should also keep my job deterministic, right?

  2. Similar to the question above: I save a checkpoint (including the random_state below) from this job

        random_state = {'random': random.getstate(),
                        'numpy_random': np.random.get_state(),
                        'torch_random': torch.get_rng_state(),
                        'torch_cuda_random': torch.cuda.get_rng_state(device=self.opt['device']) if self.opt['CUDA'] else None
                        }

and try to load it by:

        random.setstate(random_state['random'])
        np.random.set_state(random_state['numpy_random'])
        torch.set_rng_state(random_state['torch_random'])
        if self.opt['CUDA']:
            torch.cuda.set_rng_state(random_state['torch_cuda_random'], device=self.opt['device'])
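The save/restore round trip above can be sanity-checked in isolation. As a minimal sketch of the same get-state/set-state pattern using only Python's stdlib `random` module (the `torch.get_rng_state()` / `torch.set_rng_state()` pair has the same shape):

```python
import random

random.seed(42)
saved = random.getstate()                     # analogous to torch.get_rng_state()
first = [random.random() for _ in range(3)]   # draw some numbers

random.setstate(saved)                        # analogous to torch.set_rng_state()
replay = [random.random() for _ in range(3)]  # draw again from the restored state

# Restoring the saved state replays exactly the same random stream.
assert first == replay
```

If this invariant holds for each generator you checkpoint (Python, NumPy, torch CPU, torch CUDA), resuming from the checkpoint continues the same random stream as the uninterrupted run.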

Should I use

        torch.cuda.get_rng_state_all()
        torch.cuda.set_rng_state_all()

instead of

        torch.cuda.get_rng_state()
        torch.cuda.set_rng_state()

to keep things deterministic?

  3. Which of the two options below (#a or #b) would you suggest as the better one?
    #a. torch.cuda.manual_seed() + torch.cuda.get_rng_state() + torch.cuda.set_rng_state()
    #b. torch.cuda.manual_seed_all() + torch.cuda.get_rng_state_all() + torch.cuda.set_rng_state_all()

torch.manual_seed should already seed the CPU and GPU(s) as described in the Reproducibility docs.
Since you are using a single process per GPU, torch.cuda.manual_seed_all (or the other _all calls) should not be necessary.
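This is easy to check on CPU: reseeding with `torch.manual_seed` replays the same stream, and per the Reproducibility docs the same call also seeds the current CUDA device's generator when one is present (a minimal sketch, run here on CPU only):

```python
import torch

torch.manual_seed(0)     # seeds CPU RNG (and the current CUDA device, if any)
a = torch.randn(3)

torch.manual_seed(0)     # reseed with the same value
b = torch.randn(3)

# The same seed yields the same sample sequence.
assert torch.equal(a, b)
```

In a one-process-per-GPU launch, each process only ever touches its own device, so seeding that one device via `torch.manual_seed` (or `torch.cuda.manual_seed`) is sufficient; the `_all` variants matter only when a single process drives multiple GPUs.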
