I am running an official Docker (Holoscan) container on the NVIDIA Clara with Ubuntu 20.04 LTS.
- I installed PyTorch with:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
- Then I tried running my code, which returns:
/home/holoscan/.local/lib/python3.10/site-packages/torch/cuda/__init__.py:235: UserWarning:
Quadro RTX 6000 with CUDA capability sm_75 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_50 sm_80 sm_86 sm_89 sm_90 sm_90a.
If you want to use the Quadro RTX 6000 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
nvidia-smi
Fri Feb 21 10:01:21 2025
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01 Driver Version: 535.183.01 CUDA Version: 12.6 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Quadro RTX 6000 Off | 00000000:09:00.0 On | Off |
| 33% 34C P5 38W / 260W | 795MiB / 24576MiB | 2% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
+---------------------------------------------------------------------------------------+
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Fri_Jun_14_16:44:37_PDT_2024
Cuda compilation tools, release 12.6, V12.6.20
Build cuda_12.6.r12.6/compiler.34431801_0
python
Python 3.10.12 (main, Jul 29 2024, 16:56:48) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.cuda.get_arch_list())
['sm_50', 'sm_80', 'sm_86', 'sm_89', 'sm_90', 'sm_90a']
Has the sm_75 support been removed?
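For context, this is roughly how one can compare the card's compute capability with the arch list the wheel was built for (a minimal sketch, assuming the Quadro RTX 6000 is device 0):

import torch

# Compute capability reported by the driver, e.g. (7, 5) -> "sm_75" for the Quadro RTX 6000
major, minor = torch.cuda.get_device_capability(0)
gpu_arch = f"sm_{major}{minor}"

# Architectures the installed wheel ships kernels for
supported = torch.cuda.get_arch_list()
print(gpu_arch, "in wheel arch list:", gpu_arch in supported)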
-
I tried the same with an earlier version:
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
This returns the same arch list:
>>> print(torch.cuda.get_arch_list())
['sm_50', 'sm_80', 'sm_86', 'sm_89', 'sm_90', 'sm_90a']
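To see whether the kernels are really missing (and not just the warning being overly cautious), a minimal check like this should show it (sketch only, not part of my application code):

import torch

# Try an actual CUDA op; if the wheel ships no sm_75 kernels this is
# expected to fail with a "no kernel image is available" RuntimeError
try:
    x = torch.ones(4, device="cuda")
    print((x * 2).sum().item())
except RuntimeError as e:
    print("CUDA op failed:", e)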
-
I also tried another version:
pip install torch==2.0.1 --index-url https://download.pytorch.org/whl/cu118
This gives me the following error when trying to run my code:
[error] [gxf_wrapper.cpp:100] Exception occurred for operator: 'optimizeData' - AssertionError: Torch not compiled with CUDA enabled
At:
  /home/holoscan/.local/lib/python3.10/site-packages/torch/cuda/__init__.py(239): _lazy_init
  /workspace/volumes/data/ba_hyperspectral_segmentation/move_to_holohub/holohub/applications/biopsy_app_cupy/optimisation_helicoid.py(395): optimize_single_b_torch_np
  /workspace/volumes/data/ba_hyperspectral_segmentation/move_to_holohub/holohub/applications/biopsy_app_cupy/optimisation_helicoid.py(431): helicoid_optimisation_ti_parallel_torch_np
  /workspace/volumes/data/ba_hyperspectral_segmentation/move_to_holohub/holohub/applications/biopsy_app_cupy/biopsy_application.py(362): compute
[error] [entity_executor.cpp:596] Failed to tick codelet optimizeData in entity: optimizeData code: GXF_FAILURE
[warning] [greedy_scheduler.cpp:243] Error while executing entity 53 named 'optimizeData': GXF_FAILURE
[info] [greedy_scheduler.cpp:401] Scheduler finished.
[error] [program.cpp:580] wait failed. Deactivating...
[error] [runtime.cpp:1649] Graph wait failed with error: GXF_FAILURE
[warning] [gxf_executor.cpp:2241] GXF call GxfGraphWait(context) in line 2241 of file /workspace/holoscan-sdk/src/core/executors/gxf/gxf_executor.cpp failed with 'GXF_FAILURE' (1)
[info] [gxf_executor.cpp:2251] Graph execution finished.
[error] [gxf_executor.cpp:2259] Graph execution error: GXF_FAILURE
Traceback (most recent call last):
  File "/workspace/volumes/data/ba_hyperspectral_segmentation/move_to_holohub/holohub/applications/biopsy_app_cupy/biopsy_application.py", line 527, in <module>
    main()
  File "/workspace/volumes/data/ba_hyperspectral_segmentation/move_to_holohub/holohub/applications/biopsy_app_cupy/biopsy_application.py", line 520, in main
    app.run()
  File "/workspace/volumes/data/ba_hyperspectral_segmentation/move_to_holohub/holohub/applications/biopsy_app_cupy/biopsy_application.py", line 362, in compute
    errors, coef_list, scattering_params, errors_scatter = helicoid_optimisation_ti_parallel_torch_np(t1, b, M, x)
  File "/workspace/volumes/data/ba_hyperspectral_segmentation/move_to_holohub/holohub/applications/biopsy_app_cupy/optimisation_helicoid.py", line 431, in helicoid_optimisation_ti_parallel_torch_np
    results = optimize_single_b_torch_np(range(b.shape[0]), b, [b_t1]*b.shape[0], [a_t1]*b.shape[0], [M]*b.shape[0], [x]*b.shape[0], [current_x]*b.shape[0], [left_bound]*b.shape[0], [right_bound]*b.shape[0])
  File "/workspace/volumes/data/ba_hyperspectral_segmentation/move_to_holohub/holohub/applications/biopsy_app_cupy/optimisation_helicoid.py", line 395, in optimize_single_b_torch_np
    b_i = torch.as_tensor(b_i, device='cuda')
  File "/home/holoscan/.local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 239, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
torch.cuda.get_arch_list() returns an empty list:
>>> import torch
>>> print(torch.cuda.get_arch_list())
[]
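The empty arch list makes me suspect that this install pulled in a CPU-only wheel rather than a cu118 build; a minimal sketch of how to verify that:

import torch

# A CPU-only wheel reports no CUDA build at all
print(torch.__version__)          # e.g. '2.0.1+cpu' instead of '2.0.1+cu118'
print(torch.version.cuda)         # None for a CPU-only build
print(torch.cuda.is_available())  # False, matching "Torch not compiled with CUDA enabled"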
Which PyTorch version do I have to use, and if it is an older one, doesn't that reduce performance?
This might be related to PyTorch 1.6 - "Tesla T4 with CUDA capability sm_75 is not compatible", but the versions used there are much older than the ones I want to use.