Hello,
I am trying to use multiple GPUs (2× RTX 2080 Ti) with torch.distributed and pytorch-lightning on WSL2 (Windows Subsystem for Linux).
But I receive the following error:
NCCL error in: /pytorch/torch/lib/c10d/ProcessGroupNCCL.cpp:518, unhandled system error, NCCL version 2.4.8
I can run the same process on a single GPU.
Thanks for your great help.
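(Editorial sketch, not part of the original post.) "unhandled system error" is a generic NCCL message, so a common first step is to rerun with NCCL's own logging turned on via the `NCCL_DEBUG` environment variable. The helper names `nccl_debug_env` and `run_with_nccl_debug` below are hypothetical, and this only surfaces more detail; it is not a fix for the error.

```python
import os
import subprocess


def nccl_debug_env(base_env=None):
    """Return a copy of the environment with verbose NCCL logging enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # NCCL_DEBUG is read by NCCL itself; INFO makes it log initialization
    # details, which usually include the underlying reason for a system error.
    env["NCCL_DEBUG"] = "INFO"
    return env


def run_with_nccl_debug(command):
    """Run `command` (a list of strings) with NCCL debug logging turned on."""
    completed = subprocess.run(command, env=nccl_debug_env())
    return completed.returncode
```

For example, `run_with_nccl_debug(["python", "train.py"])` would launch a training script (here the placeholder name `train.py`) with the extra logging; the same effect can be had by exporting `NCCL_DEBUG=INFO` in the shell before starting the job.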
ebarsoum
(Emad Barsoum)
August 25, 2020, 6:29am
2
Is SLI enabled? What is the output of nvidia-smi?
Thank you for your reply.
Since I use CUDA on WSL (https://docs.nvidia.com/cuda/wsl-user-guide/index.html#getting-started), the “nvidia-smi” utility is not available in Ubuntu (on WSL).
Running nvidia-smi in the Windows shell outputs the following:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.41       Driver Version: 455.41       CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208... WDDM  | 00000000:1A:00.0  On |                  N/A |
| 41%   50C    P0    59W / 250W |    662MiB / 11264MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208... WDDM  | 00000000:68:00.0 Off |                  N/A |
| 27%   35C    P8    20W / 250W |    216MiB / 11264MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
Does this output mean SLI is enabled?
phamvanvung
(Pham Van Vung)
September 30, 2020, 12:25am
4
I have the same question. It would be great to have an answer to this.
@Naruki_Ichihara did you ever figure this out?