Is it possible to make CUDA_VISIBLE_DEVICES and DDP work together?
I am trying to run a script on an 8-GPU server like so:
CUDA_VISIBLE_DEVICES=0,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=7 --use_env main.py
but I always run into:
RuntimeError: CUDA error: invalid device ordinal
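For context, CUDA renumbers whatever devices CUDA_VISIBLE_DEVICES exposes, so with seven GPUs visible the only valid ordinals inside each process are 0 through 6, and physical GPU 7 is addressed as ordinal 6. The sketch below models that mapping in plain Python without touching CUDA. The helper names `visible_devices` and `physical_id` are hypothetical, and it assumes the variable holds comma-separated integer indices (not GPU UUIDs) and that the local rank is what the script passes as the device index.

```python
import os


def visible_devices(env=None):
    """Return the physical GPU ids listed in CUDA_VISIBLE_DEVICES, in order.

    Assumes comma-separated integer indices. An unset or empty variable
    yields an empty list here, although CUDA itself would then expose
    every GPU on the machine.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES", "")
    return [int(tok) for tok in raw.split(",") if tok.strip()]


def physical_id(local_rank, env=None):
    """Map an in-process device ordinal to the physical GPU id it refers to."""
    devices = visible_devices(env)
    if not 0 <= local_rank < len(devices):
        # Illustrative stand-in for the "invalid device ordinal" condition.
        raise ValueError(
            f"ordinal {local_rank} is out of range: "
            f"only {len(devices)} device(s) are visible"
        )
    return devices[local_rank]


# Hypothetical environment mirroring the launch command above.
example_env = {"CUDA_VISIBLE_DEVICES": "0,2,3,4,5,6,7"}
mapping = {rank: physical_id(rank, example_env) for rank in range(7)}
```

Under these assumptions, a script that selects its device from the LOCAL_RANK environment variable (which the launcher sets when --use_env is given) stays within 0 to 6, while any code that hard-codes a physical index such as 7, or that sizes its device list for 8 GPUs, would request an ordinal that does not exist in the process. Whether that is what happens in main.py cannot be told from the command alone.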
Here is the output of nvidia-smi:
Tue Aug 18 15:21:16 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  On   | 00000000:04:00.0 Off |                  N/A |
| 20%   13C    P8     7W / 235W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  On   | 00000000:05:00.0 Off |                  N/A |
| 23%   18C    P8     8W / 235W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 108...  On   | 00000000:08:00.0 Off |                  N/A |
| 23%   20C    P8     8W / 235W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 108...  On   | 00000000:09:00.0 Off |                  N/A |
| 23%   23C    P8     8W / 235W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   4  GeForce GTX 108...  On   | 00000000:84:00.0 Off |                  N/A |
| 23%   18C    P8    11W / 235W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   5  GeForce GTX 108...  On   | 00000000:85:00.0 Off |                  N/A |
| 20%   16C    P8     7W / 235W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   6  GeForce GTX 108...  On   | 00000000:88:00.0 Off |                  N/A |
| 20%   15C    P8     7W / 235W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   7  GeForce GTX 108...  On   | 00000000:89:00.0 Off |                  N/A |
| 23%   25C    P8     7W / 235W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
What am I missing?