Do I need separate scripts for distributed training and single GPU?

maxmatical · December 9, 2019, 3:47pm

I’ve been trying to create a training script that can run on either a single GPU or multiple GPUs (distributed training) by setting nproc_per_node to the number of GPUs being used. However, when I try nproc_per_node=1, I get this runtime error: "RuntimeError: Default process group has not been initialized, please make sure to call init_process_group." So do I need a separate script for each scenario, or is there a way to set up one script so that it handles both?
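For reference, here is a rough sketch of the kind of single-script setup being asked about. It is not a confirmed solution. It assumes the launcher exports WORLD_SIZE, RANK and LOCAL_RANK as environment variables (torchrun does this; the older torch.distributed.launch may need --use_env, or passes --local_rank as an argument instead). The helper names distributed_config, maybe_init_process_group and wrap_model are made up for illustration.

```python
import os


def distributed_config(env=None):
    """Read launcher-provided variables; fall back to one process if absent."""
    env = os.environ if env is None else env
    return {
        # Assumption: the launcher sets these variables for every worker.
        "launched": "WORLD_SIZE" in env and "RANK" in env,
        "world_size": int(env.get("WORLD_SIZE", "1")),
        "rank": int(env.get("RANK", "0")),
        "local_rank": int(env.get("LOCAL_RANK", "0")),
    }


def maybe_init_process_group(cfg, backend="nccl"):
    """Initialize the default process group only when a launcher started us."""
    if not cfg["launched"]:
        return False  # plain `python train.py`: nothing to initialize
    import torch.distributed as dist  # imported lazily, only when needed

    if not dist.is_initialized():
        # env:// reads MASTER_ADDR, MASTER_PORT, RANK and WORLD_SIZE.
        dist.init_process_group(backend=backend, init_method="env://")
    return True


def wrap_model(model, cfg):
    """Wrap in DistributedDataParallel only when a process group exists."""
    if not cfg["launched"]:
        return model
    from torch.nn.parallel import DistributedDataParallel as DDP

    return DDP(model, device_ids=[cfg["local_rank"]])
```

With this layout, the same file could be started directly with python for one GPU, or through the launcher with any nproc_per_node (including 1), since the process group is initialized whenever the launcher variables are present and skipped otherwise.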