Specify which GPUs to use with torch.distributed.launch

Hi all,

is there a way to specify a list of GPUs that should be used on a node?

The documentation only shows how to specify the number of GPUs to use:

python -m torch.distributed.launch --nproc_per_node=NUM_GPUS_YOU_HAVE ...

This was already asked in this thread but not answered.


You can make certain GPUs visible via CUDA_VISIBLE_DEVICES=1,3,7 python -m torch.distributed.launch ..., which maps GPU1, GPU3, and GPU7 to cuda:0, cuda:1, and cuda:2 inside the script, so the workload (DDP in your case) runs only on these devices.
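
To make the renumbering concrete, here is a minimal torch-free sketch of how the visible devices get reindexed inside the process (the dict it builds is just an illustration of the mapping, not a PyTorch API):

```python
import os

# Restrict this process to physical GPUs 1, 3 and 7.
# Note: this must be set before CUDA is initialized (i.e. before the
# first torch.cuda call), otherwise it has no effect.
os.environ["CUDA_VISIBLE_DEVICES"] = "1,3,7"

# Inside the process, CUDA renumbers the visible devices from 0,
# so physical GPU1 -> cuda:0, GPU3 -> cuda:1, GPU7 -> cuda:2.
visible = os.environ["CUDA_VISIBLE_DEVICES"].split(",")
mapping = {f"cuda:{i}": f"physical GPU{gpu}" for i, gpu in enumerate(visible)}
print(mapping)
```

In practice it is simpler to set the variable on the command line rather than in the script, e.g. CUDA_VISIBLE_DEVICES=1,3,7 python -m torch.distributed.launch --nproc_per_node=3 ..., where nproc_per_node matches the number of visible GPUs.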