Hi all,
I noticed an issue with my current setup. When I run the following snippet, my process shows up on every GPU in nvidia-smi. The tensor is moved to cuda:0 as expected, and the other GPUs report zero memory usage, but the process is still listed on all of them for some reason.
import torch
a = torch.rand(10)
b = a.cuda()
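I assume one workaround is to restrict device visibility before torch initializes CUDA, along these lines (just a sketch of what I have in mind, not something I have verified fixes it):

```python
import os

# Restrict this process to a single GPU. This must happen
# before torch initializes CUDA, ideally before importing torch.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch

a = torch.rand(10)
# Guarded so the snippet also runs on CPU-only machines.
if torch.cuda.is_available():
    b = a.cuda()  # with visibility restricted, only GPU 0 should be touched
```

But I would still like to understand why the process appears on every GPU in the first place.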
Below are the GPU usage stats reported by nvidia-smi:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.32.00 Driver Version: 455.32.00 CUDA Version: 11.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce GTX TIT... On | 00000000:04:00.0 Off | N/A |
| 22% 31C P2 68W / 250W | 519MiB / 12212MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 GeForce GTX TIT... On | 00000000:05:00.0 Off | N/A |
| 22% 27C P8 15W / 250W | 4MiB / 12212MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 2 GeForce GTX TIT... On | 00000000:08:00.0 Off | N/A |
| 22% 28C P8 15W / 250W | 4MiB / 12212MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 3 GeForce GTX TIT... On | 00000000:09:00.0 Off | N/A |
| 22% 26C P8 15W / 250W | 4MiB / 12212MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 4 GeForce GTX TIT... On | 00000000:85:00.0 Off | N/A |
| 22% 27C P8 14W / 250W | 4MiB / 12212MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 5 GeForce GTX TIT... On | 00000000:86:00.0 Off | N/A |
| 22% 27C P8 15W / 250W | 4MiB / 12212MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 6 GeForce GTX TIT... On | 00000000:89:00.0 Off | N/A |
| 22% 24C P8 15W / 250W | 4MiB / 12212MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 7 GeForce GTX TIT... On | 00000000:8A:00.0 Off | N/A |
| 22% 27C P8 15W / 250W | 4MiB / 12212MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 32155 C python 514MiB |
| 1 N/A N/A 32155 C python 0MiB |
| 2 N/A N/A 32155 C python 0MiB |
| 3 N/A N/A 32155 C python 0MiB |
| 4 N/A N/A 32155 C python 0MiB |
| 5 N/A N/A 32155 C python 0MiB |
| 6 N/A N/A 32155 C python 0MiB |
| 7 N/A N/A 32155 C python 0MiB |
+-----------------------------------------------------------------------------+
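For completeness, PyTorch's own per-device memory counters can be used to cross-check which devices actually hold allocations (a small helper I would use for this; it simply prints zero for devices with no allocations):

```python
import torch

# Print PyTorch's view of allocated memory on each visible device.
# Only the device holding the tensor should report a nonzero count.
for i in range(torch.cuda.device_count()):
    print(f"cuda:{i}: {torch.cuda.memory_allocated(i)} bytes allocated")
```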
Other information:
OS: Ubuntu 20.04.1 LTS
PyTorch: 1.7.1 (other versions show the same issue)
CUDA version: 11.1
Driver Version: 455.32.00
Hardware: 8x GeForce GTX TITAN X
Could someone please let me know what is going on?
Many thanks,
Fred