I have four GPUs and I am using nn.DataParallel, passing the IDs of all four GPUs to it. However, only the first GPU fills up (32 GB out of 32 GB) while the others stay mostly empty (about 7 GB out of 32 GB). Is that because the first GPU acts as the master and aggregates the gradients, or is this not the usual behavior? I attached a snapshot of nvidia-smi:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:18:00.0 Off |                    0 |
| N/A   44C    P0    63W / 300W | 32134MiB / 32768MiB  |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-SXM2...  On   | 00000000:3B:00.0 Off |                    0 |
| N/A   35C    P0    56W / 300W |  6656MiB / 32768MiB  |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  Tesla V100-SXM2...  On   | 00000000:86:00.0 Off |                    0 |
| N/A   35C    P0    55W / 300W |  6524MiB / 32768MiB  |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  Tesla V100-SXM2...  On   | 00000000:AF:00.0 Off |                    0 |
| N/A   38C    P0    55W / 300W |  7052MiB / 32768MiB  |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+