torch.cuda.get_device_name(0) error response

PyTorch version: 1.12.0+cu102
CUDA version: release 11.6, V11.6.124

torch.cuda.get_device_name(0)
/usr/local/lib/python3.8/dist-packages/torch/cuda/__init__.py:146: UserWarning:
NVIDIA A100-SXM4-80GB with CUDA capability sm_80 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA A100-SXM4-80GB GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
'NVIDIA A100-SXM4-80GB'

torch.cuda.get_device_name(1)
'NVIDIA A100-SXM4-80GB'
torch.cuda.get_device_name(2)
'NVIDIA A100-SXM4-80GB'
torch.cuda.get_device_name(3)
'NVIDIA A100-SXM4-80GB'
torch.cuda.get_device_name(4)
'NVIDIA A100-SXM4-80GB'
torch.cuda.get_device_name(5)
'NVIDIA A100-SXM4-80GB'
torch.cuda.get_device_name(6)
'NVIDIA A100-SXM4-80GB'
torch.cuda.get_device_name(7)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/torch/cuda/__init__.py", line 329, in get_device_name
    return get_device_properties(device).name
  File "/usr/local/lib/python3.8/dist-packages/torch/cuda/__init__.py", line 362, in get_device_properties
    raise AssertionError("Invalid device i

You’ve installed a PyTorch binary built with CUDA 10.2, but your A100 (Ampere, sm_80) GPU needs a build with CUDA 11 or newer.
Install any of the latest binaries and it’ll work.
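
One way to confirm this kind of mismatch is to compare the GPU's compute capability with the architectures the installed binary was compiled for. The following is only a minimal sketch: the helper name `binary_supports` is hypothetical (it is not a PyTorch API), it uses a simplified rule that matches on the major version only, and the example values are copied from the warning above.

```python
def binary_supports(capability, arch_list):
    """Return True if any compiled sm_XY entry shares the GPU's major version.

    Hypothetical helper using a simplified rule (major version match only).
    """
    major, _minor = capability
    compiled_majors = set()
    for arch in arch_list:
        if not arch.startswith("sm_"):
            continue  # skip PTX entries such as "compute_80"
        digits = "".join(ch for ch in arch[3:] if ch.isdigit())
        if digits:
            compiled_majors.add(int(digits) // 10)
    return major in compiled_majors


# Values from the warning: the A100 is sm_80, the cu102 build lists sm_37 to sm_70.
print(binary_supports((8, 0), ["sm_37", "sm_50", "sm_60", "sm_70"]))

# With PyTorch installed and a GPU visible, the live values can be queried instead.
try:
    import torch

    if torch.cuda.is_available():
        print(torch.version.cuda)  # CUDA version the binary was built with
        print(binary_supports(torch.cuda.get_device_capability(0),
                              torch.cuda.get_arch_list()))
except ImportError:
    pass
```

If the helper returns False for the installed build, that agrees with the UserWarning shown earlier in the thread.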

I updated PyTorch and still have the same problem: GPU id 7 is missing.

import torch
torch.cuda.device_count()
7
torch.cuda.get_device_name(0)
'NVIDIA A100-SXM4-80GB'
torch.cuda.get_device_name(1)
'NVIDIA A100-SXM4-80GB'
torch.cuda.get_device_name(2)
'NVIDIA A100-SXM4-80GB'
torch.cuda.get_device_name(3)
'NVIDIA A100-SXM4-80GB'
torch.cuda.get_device_name(4)
'NVIDIA A100-SXM4-80GB'
torch.cuda.get_device_name(5)
'NVIDIA A100-SXM4-80GB'
torch.cuda.get_device_name(6)
'NVIDIA A100-SXM4-80GB'
torch.cuda.get_device_name(7)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/conda/lib/python3.8/site-packages/torch/cuda/__init__.py", line 329, in get_device_name
    return get_device_properties(device).name
  File "/opt/conda/lib/python3.8/site-packages/torch/cuda/__init__.py", line 362, in get_device_properties
    raise AssertionError("Invalid device id")
AssertionError: Invalid device id
torch.__version__
'1.12.1+cu113'

Could you describe what device 7 is according to nvidia-smi?

nvidia-smi does show device 7:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-SXM...  On   | 00000000:10:00.0 Off |                    0 |
| N/A   39C    P0    65W / 400W |      0MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-SXM...  On   | 00000000:16:00.0 Off |                    0 |
| N/A   36C    P0    66W / 400W |      0MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA A100-SXM...  On   | 00000000:2F:00.0 Off |                    0 |
| N/A   36C    P0    63W / 400W |      0MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA A100-SXM...  On   | 00000000:33:00.0 Off |                    0 |
| N/A   38C    P0    69W / 400W |      0MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   4  NVIDIA A100-SXM...  On   | 00000000:C5:00.0 Off |                    0 |
| N/A   37C    P0    66W / 400W |      0MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   5  NVIDIA A100-SXM...  On   | 00000000:CA:00.0 Off |                    0 |
| N/A   38C    P0    65W / 400W |      0MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   6  NVIDIA A100-SXM...  On   | 00000000:E3:00.0 Off |                    0 |
| N/A   36C    P0    67W / 400W |      0MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   7  NVIDIA A100-SXM...  On   | 00000000:E7:00.0 Off |                    0 |
| N/A   40C    P0    67W / 400W |      0MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

Are you able to run any code on cuda:7? If not, did you set CUDA_VISIBLE_DEVICES in your environment?
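
Both questions can be checked from Python. This is only a sketch: `visible_devices` is a hypothetical helper, not a PyTorch function, and whether the environment variable is actually the cause here is not established in this thread. The sample value "0,1,2,3,4,5,6" is an assumption chosen to match a `device_count()` of 7.

```python
import os


def visible_devices(env=None):
    """Return the CUDA_VISIBLE_DEVICES entries as a list, or None if unset.

    Hypothetical helper; None means the variable does not restrict the GPUs.
    """
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None
    return [entry.strip() for entry in value.split(",") if entry.strip()]


# What the current process sees.
print(visible_devices())

# Assumed example: a mask like this would hide the eighth GPU from PyTorch.
print(visible_devices({"CUDA_VISIBLE_DEVICES": "0,1,2,3,4,5,6"}))

# Smoke test on cuda:7, run only if PyTorch can actually see eight devices.
try:
    import torch

    if torch.cuda.is_available() and torch.cuda.device_count() > 7:
        x = torch.ones(1, device="cuda:7")
        print(x * 2)
except ImportError:
    pass
```

If the first call prints a list with fewer than eight entries, the variable is masking devices for this process even though nvidia-smi still lists all of them.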