Setting visible devices with Distributed Data Parallel

Hey @Diego, the launching script launches multiple subprocesses, each of which might inherit the CUDA_VISIBLE_DEVICES value you passed on the command line. A workaround would be to set CUDA_VISIBLE_DEVICES in main.py before loading any CUDA-related packages. Note that the recommended way to use DDP is one process per device, i.e., each process should run exclusively on one GPU. If you want this, you need to set CUDA_VISIBLE_DEVICES to a different value for each subprocess.
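Here is a rough sketch of what the per-subprocess setting could look like at the top of main.py. It assumes the launcher exposes the process's local rank through the LOCAL_RANK environment variable (torchrun does this; the older torch.distributed.launch passes a --local_rank argument instead). The helper name pin_process_to_gpu and the GPU list [2, 3] are made up for illustration.

```python
import os


def pin_process_to_gpu(local_rank, gpu_ids):
    """Expose exactly one physical GPU to the current process.

    Hypothetical helper: call it before importing torch (or anything else
    that initializes CUDA), otherwise the setting may be ignored.
    """
    gpu_id = gpu_ids[local_rank]
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return gpu_id


# Possible usage at the very top of main.py (assumption: LOCAL_RANK is set
# by the launcher, and physical GPUs 2 and 3 are the ones you want to use):
#
#   local_rank = int(os.environ.get("LOCAL_RANK", "0"))
#   pin_process_to_gpu(local_rank, gpu_ids=[2, 3])
#   import torch  # imported only after the environment variable is set
#   device = torch.device("cuda:0")  # the single visible GPU is ordinal 0
```

Since each process then sees only one GPU, that GPU is always cuda:0 inside the process, whatever its physical index is.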

BTW, what’s the default CUDA_VISIBLE_DEVICES value on your machine? I would assume the script can see all devices by default if CUDA_VISIBLE_DEVICES isn’t set. And when the program throws RuntimeError: CUDA error: invalid device ordinal, do you know which device it is trying to access?
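If it helps with checking, something like the snippet below could be dropped into each subprocess to report what it actually inherited. The function name visible_device_ids is made up; it only reads the environment variable and does not query CUDA itself.

```python
import os


def visible_device_ids(env=None):
    """Return the entries of CUDA_VISIBLE_DEVICES, or None if it is unset.

    Hypothetical helper. None means the variable is not set, in which case
    all GPUs should be visible. An empty list means it is set but empty,
    which hides every GPU. Entries stay as strings because the variable
    may hold UUIDs as well as integer indices.
    """
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None
    return [item.strip() for item in value.split(",") if item.strip()]


# Print this from each subprocess to see what it inherited.
print("CUDA_VISIBLE_DEVICES entries:", visible_device_ids())
```

With N entries visible, the valid ordinals inside the process are 0 to N-1. For example, with CUDA_VISIBLE_DEVICES=2,3 the process can use cuda:0 and cuda:1, and asking for cuda:2 would be an invalid device ordinal. Comparing this against torch.cuda.device_count() in the same process should show whether the two agree.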