Cannot get PyTorch to work on HPC node

Hello Everyone,

Sorry in advance if this is a repeat post, but I have looked everywhere and tried my best for weeks, and I still cannot figure this out.

I will try to include as much information as possible.

I am a student and I am trying to install DeepLabCut for a project. It uses torch, cuDNN, and CUDA. My main issue is that I cannot reinstall the NVIDIA GPU drivers: the cluster already has them preinstalled and I am not able to alter them.

I am using an interactive bash session on my HPC cluster, started with this Slurm command:

srun --nodes=1 --ntasks=1 --cpus-per-task=4 --mem=16GB --time=04:00:00 --gres=gpu:1 --nodelist=cn14 --pty bash

When I run nvidia-smi I get:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla P100-PCIE-16GB            Off| 00000000:08:00.0 Off |                    0 |
| N/A   24C    P0               26W / 250W|      0MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  Tesla P100-PCIE-16GB            Off| 00000000:84:00.0 Off |                    0 |
| N/A   27C    P0               27W / 250W|      0MiB / 16384MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

Output of which nvidia-smi, nvcc --version, and which nvcc:

(DEEPLABCUT) [fmmachta@cn14 bin]$ which nvidia-smi
/usr/bin/nvidia-smi
(DEEPLABCUT) [fmmachta@cn14 bin]$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Feb__7_19:32:13_PST_2023
Cuda compilation tools, release 12.1, V12.1.66
Build cuda_12.1.r12.1/compiler.32415258_0
(DEEPLABCUT) [fmmachta@cn14 bin]$ which nvcc
/data03/home/fmmachta/miniconda3/envs/DEEPLABCUT/bin/nvcc

Then I run ipython to test my torch CUDA integration:

In [1]: import torch
In [2]: torch.cuda.is_available()
Out[2]: False
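
A slightly fuller version of the same check can help narrow things down. This is only a minimal sketch: the helper name describe_torch_build is made up for illustration, and it relies on the fact that torch.version.cuda is None when the imported PyTorch build was compiled without CUDA. It also prints where torch was imported from, which is useful when more than one copy may be installed in the environment.

```python
import importlib.util


def describe_torch_build():
    """Return a small dict describing the PyTorch build this interpreter imports.

    describe_torch_build is a hypothetical helper name, not part of PyTorch.
    """
    # Avoid an ImportError if torch is not installed in this environment.
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}

    import torch

    return {
        "installed": True,
        "version": torch.__version__,
        # CUDA version the binary was built against; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        # True only if a CUDA build can actually reach a GPU through the driver.
        "cuda_available": torch.cuda.is_available(),
        # File path of the imported package, to see which install is in use.
        "install_path": torch.__file__,
    }


if __name__ == "__main__":
    for key, value in describe_torch_build().items():
        print(f"{key}: {value}")
```

If built_with_cuda comes back as None, the problem is the installed binary rather than the driver or the node.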

Here is my conda list:

(DEEPLABCUT) fmmachta@commander /apps/local/cuda/bin $ conda list
# packages in environment at /data03/home/fmmachta/miniconda3/envs/DEEPLABCUT:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                  2_kmp_llvm    conda-forge
albumentations            1.4.3                    pypi_0    pypi
anyio                     4.4.0              pyhd8ed1ab_0    conda-forge
aom                       3.9.1                hac33072_0    conda-forge
argon2-cffi               23.1.0             pyhd8ed1ab_0    conda-forge
argon2-cffi-bindings      21.2.0          py310h2372a71_4    conda-forge
arrow                     1.3.0              pyhd8ed1ab_0    conda-forge
asttokens                 2.4.1              pyhd8ed1ab_0    conda-forge
attrs                     23.2.0             pyh71513ae_0    conda-forge
beautifulsoup4            4.12.3             pyha770c72_0    conda-forge
blas                      1.0                    openblas  
bleach                    6.1.0              pyhd8ed1ab_0    conda-forge
blosc                     1.21.6               hef167b5_0    conda-forge
brotli-python             1.1.0           py310hc6cd4ac_1    conda-forge
bzip2                     1.0.8                h4bc722e_7    conda-forge
c-ares                    1.32.3               h4bc722e_0    conda-forge
c-blosc2                  2.12.0               hb4ffafa_0    conda-forge
ca-certificates           2024.7.4             hbcca054_0    conda-forge
cached-property           1.5.2                hd8ed1ab_1    conda-forge
cached_property           1.5.2              pyha770c72_1    conda-forge
cairo                     1.18.0               hebfffa5_3    conda-forge
cccl                      2.3.2                h2c7f797_0  
certifi                   2024.7.4        py310h06a4308_0  
cffi                      1.16.0          py310h2fee648_0    conda-forge
charset-normalizer        3.3.2              pyhd8ed1ab_0    conda-forge
click                     8.1.7                    pypi_0    pypi
comm                      0.2.2              pyhd8ed1ab_0    conda-forge
contourpy                 1.2.1                    pypi_0    pypi
cuda-cccl                 12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-command-line-tools   12.1.0                        0    nvidia/label/cuda-12.1.0
cuda-compiler             12.1.0                        0    nvidia/label/cuda-12.1.0
cuda-cudart               12.1.105                      0    nvidia
cuda-cudart-dev           12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-cudart-static        12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-cuobjdump            12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-cupti                12.1.105                      0    nvidia
cuda-cupti-static         12.1.62                       0    nvidia/label/cuda-12.1.0
cuda-cuxxfilt             12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-documentation        12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-driver-dev           12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-gdb                  12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-libraries            12.1.0                        0    nvidia
cuda-libraries-dev        12.1.0                        0    nvidia/label/cuda-12.1.0
cuda-libraries-static     12.1.0                        0    nvidia/label/cuda-12.1.0
cuda-nsight               12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-nsight-compute       12.1.0                        0    nvidia/label/cuda-12.1.0
cuda-nvcc                 12.1.66                       0    nvidia/label/cuda-12.1.0
cuda-nvdisasm             12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-nvml-dev             12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-nvprof               12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-nvprune              12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-nvrtc                12.1.105                      0    nvidia
cuda-nvrtc-dev            12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-nvrtc-static         12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-nvtx                 12.1.105                      0    nvidia
cuda-nvvp                 12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-opencl               12.5.39                       0    nvidia
cuda-opencl-dev           12.1.56                       0    nvidia/label/cuda-12.1.0
cuda-profiler-api         12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-runtime              12.1.0                        0    nvidia
cuda-sanitizer-api        12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-toolkit              12.1.0                        0    nvidia/label/cuda-12.1.0
cuda-tools                12.1.0                        0    nvidia/label/cuda-12.1.0
cuda-version              12.5                          3    nvidia
cuda-visual-tools         12.1.0                        0    nvidia/label/cuda-12.1.0
cycler                    0.12.1                   pypi_0    pypi
dav1d                     1.2.1                hd590300_0    conda-forge
debugpy                   1.8.2           py310h76e45a6_0    conda-forge
decorator                 5.1.1              pyhd8ed1ab_0    conda-forge
deeplabcut                3.0.0rc3                 pypi_0    pypi
defusedxml                0.7.1              pyhd8ed1ab_0    conda-forge
dlclibrary                0.0.6                    pypi_0    pypi
docker-pycreds            0.4.0                    pypi_0    pypi
einops                    0.8.0                    pypi_0    pypi
entrypoints               0.4                pyhd8ed1ab_0    conda-forge
exceptiongroup            1.2.2              pyhd8ed1ab_0    conda-forge
executing                 2.0.1              pyhd8ed1ab_0    conda-forge
expat                     2.6.2                h59595ed_0    conda-forge
ffmpeg                    7.0.1           gpl_h9be9148_104    conda-forge
filelock                  3.15.4                   pypi_0    pypi
filterpy                  1.4.5                    pypi_0    pypi
font-ttf-dejavu-sans-mono 2.37                 hab24e00_0    conda-forge
font-ttf-inconsolata      3.000                h77eed37_0    conda-forge
font-ttf-source-code-pro  2.038                h77eed37_0    conda-forge
font-ttf-ubuntu           0.83                 h77eed37_2    conda-forge
fontconfig                2.14.2               h14ed4e7_0    conda-forge
fonts-conda-ecosystem     1                             0    conda-forge
fonts-conda-forge         1                             0    conda-forge
fonttools                 4.53.1                   pypi_0    pypi
fqdn                      1.5.1              pyhd8ed1ab_0    conda-forge
freetype                  2.12.1               h267a509_2    conda-forge
fribidi                   1.0.10               h36c2ea0_0    conda-forge
fsspec                    2024.6.1                 pypi_0    pypi
gds-tools                 1.6.0.25                      0    nvidia/label/cuda-12.1.0
gettext                   0.22.5               h59595ed_2    conda-forge
gettext-tools             0.22.5               h59595ed_2    conda-forge
gitdb                     4.0.11                   pypi_0    pypi
gitpython                 3.1.43                   pypi_0    pypi
gmp                       6.3.0                hac33072_2    conda-forge
gmpy2                     2.1.2           py310heeb90bb_0  
gnutls                    3.7.9                hb077bed_0    conda-forge
graphite2                 1.3.13            h59595ed_1003    conda-forge
h2                        4.1.0              pyhd8ed1ab_0    conda-forge
harfbuzz                  9.0.0                hda332d3_1    conda-forge
hdf5                      1.14.3          nompi_hdf9ad27_105    conda-forge
hpack                     4.0.0              pyh9f0ad1d_0    conda-forge
huggingface-hub           0.24.2                   pypi_0    pypi
hyperframe                6.0.1              pyhd8ed1ab_0    conda-forge
icu                       75.1                 he02047a_0    conda-forge
idna                      3.7                pyhd8ed1ab_0    conda-forge
imageio                   2.34.2                   pypi_0    pypi
imageio-ffmpeg            0.5.1                    pypi_0    pypi
imgaug                    0.4.0                    pypi_0    pypi
importlib_resources       6.4.0              pyhd8ed1ab_0    conda-forge
ipykernel                 6.29.5             pyh3099207_0    conda-forge
ipython                   8.26.0             pyh707e725_0    conda-forge
ipython_genutils          0.2.0              pyhd8ed1ab_1    conda-forge
ipywidgets                8.1.3              pyhd8ed1ab_0    conda-forge
isoduration               20.11.0            pyhd8ed1ab_0    conda-forge
jedi                      0.19.1             pyhd8ed1ab_0    conda-forge
jinja2                    3.1.4              pyhd8ed1ab_0    conda-forge
joblib                    1.4.2                    pypi_0    pypi
jsonpointer               3.0.0           py310hff52083_0    conda-forge
jsonschema                4.23.0             pyhd8ed1ab_0    conda-forge
jsonschema-specifications 2023.12.1          pyhd8ed1ab_0    conda-forge
jsonschema-with-format-nongpl 4.23.0               hd8ed1ab_0    conda-forge
jupyter                   1.0.0             pyhd8ed1ab_10    conda-forge
jupyter_client            7.4.9              pyhd8ed1ab_0    conda-forge
jupyter_console           6.6.3              pyhd8ed1ab_0    conda-forge
jupyter_core              5.7.2           py310hff52083_0    conda-forge
jupyter_events            0.10.0             pyhd8ed1ab_0    conda-forge
jupyter_server            2.14.2             pyhd8ed1ab_0    conda-forge
jupyter_server_terminals  0.5.3              pyhd8ed1ab_0    conda-forge
jupyterlab_pygments       0.3.0              pyhd8ed1ab_1    conda-forge
jupyterlab_widgets        3.0.11             pyhd8ed1ab_0    conda-forge
keyutils                  1.6.1                h166bdaf_0    conda-forge
kiwisolver                1.4.5                    pypi_0    pypi
krb5                      1.21.3               h659f571_0    conda-forge
lame                      3.100             h166bdaf_1003    conda-forge
lazy-loader               0.4                      pypi_0    pypi
lcms2                     2.16                 hb7c19ff_0    conda-forge
ld_impl_linux-64          2.40                 hf3520f5_7    conda-forge
lerc                      4.0.0                h27087fc_0    conda-forge
libabseil                 20240116.2      cxx17_he02047a_1    conda-forge
libaec                    1.1.3                h59595ed_0    conda-forge
libasprintf               0.22.5               h661eb56_2    conda-forge
libasprintf-devel         0.22.5               h661eb56_2    conda-forge
libass                    0.17.1               h39113c1_2    conda-forge
libblas                   3.9.0           23_linux64_openblas    conda-forge
libcblas                  3.9.0           23_linux64_openblas    conda-forge
libcublas                 12.1.0.26                     0    nvidia
libcublas-dev             12.1.0.26                     0    nvidia/label/cuda-12.1.0
libcublas-static          12.1.0.26                     0    nvidia/label/cuda-12.1.0
libcufft                  11.0.2.4                      0    nvidia
libcufft-dev              11.0.2.4                      0    nvidia/label/cuda-12.1.0
libcufft-static           11.0.2.4                      0    nvidia/label/cuda-12.1.0
libcufile                 1.10.1.7                      0    nvidia
libcufile-dev             1.6.0.25                      0    nvidia/label/cuda-12.1.0
libcufile-static          1.6.0.25                      0    nvidia/label/cuda-12.1.0
libcurand                 10.3.6.82                     0    nvidia
libcurand-dev             10.3.2.56                     0    nvidia/label/cuda-12.1.0
libcurand-static          10.3.2.56                     0    nvidia/label/cuda-12.1.0
libcurl                   8.9.0                hdb1bdb2_0    conda-forge
libcusolver               11.4.4.55                     0    nvidia
libcusolver-dev           11.4.4.55                     0    nvidia/label/cuda-12.1.0
libcusolver-static        11.4.4.55                     0    nvidia/label/cuda-12.1.0
libcusparse               12.0.2.55                     0    nvidia
libcusparse-dev           12.0.2.55                     0    nvidia/label/cuda-12.1.0
libcusparse-static        12.0.2.55                     0    nvidia/label/cuda-12.1.0
libdeflate                1.20                 hd590300_0    conda-forge
libdrm                    2.4.122              h4ab18f5_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 hd590300_2    conda-forge
libexpat                  2.6.2                h59595ed_0    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 14.1.0               h77fa898_0    conda-forge
libgettextpo              0.22.5               h59595ed_2    conda-forge
libgettextpo-devel        0.22.5               h59595ed_2    conda-forge
libgfortran-ng            14.1.0               h69a702a_0    conda-forge
libgfortran5              14.1.0               hc5f4f2c_0    conda-forge
libglib                   2.80.3               h8a4344b_1    conda-forge
libhwloc                  2.11.1          default_hecaa2ac_1000    conda-forge
libiconv                  1.17                 hd590300_2    conda-forge
libidn2                   2.3.7                hd590300_0    conda-forge
libjpeg-turbo             3.0.3                h5eee18b_0  
liblapack                 3.9.0           23_linux64_openblas    conda-forge
libnghttp2                1.58.0               h47da74e_1    conda-forge
libnpp                    12.0.2.50                     0    nvidia
libnpp-dev                12.0.2.50                     0    nvidia/label/cuda-12.1.0
libnpp-static             12.0.2.50                     0    nvidia/label/cuda-12.1.0
libnsl                    2.0.1                hd590300_0    conda-forge
libnvjitlink              12.1.105                      0    nvidia
libnvjitlink-dev          12.1.55                       0    nvidia/label/cuda-12.1.0
libnvjpeg                 12.1.1.14                     0    nvidia
libnvjpeg-dev             12.1.0.39                     0    nvidia/label/cuda-12.1.0
libnvjpeg-static          12.1.0.39                     0    nvidia/label/cuda-12.1.0
libnvvm-samples           12.1.55                       0    nvidia/label/cuda-12.1.0
libopenblas               0.3.27          pthreads_hac2b453_1    conda-forge
libopenvino               2024.2.0             h2da1b83_1    conda-forge
libopenvino-auto-batch-plugin 2024.2.0             hb045406_1    conda-forge
libopenvino-auto-plugin   2024.2.0             hb045406_1    conda-forge
libopenvino-hetero-plugin 2024.2.0             h5c03a75_1    conda-forge
libopenvino-intel-cpu-plugin 2024.2.0             h2da1b83_1    conda-forge
libopenvino-intel-gpu-plugin 2024.2.0             h2da1b83_1    conda-forge
libopenvino-intel-npu-plugin 2024.2.0             he02047a_1    conda-forge
libopenvino-ir-frontend   2024.2.0             h5c03a75_1    conda-forge
libopenvino-onnx-frontend 2024.2.0             h07e8aee_1    conda-forge
libopenvino-paddle-frontend 2024.2.0             h07e8aee_1    conda-forge
libopenvino-pytorch-frontend 2024.2.0             he02047a_1    conda-forge
libopenvino-tensorflow-frontend 2024.2.0             h39126c6_1    conda-forge
libopenvino-tensorflow-lite-frontend 2024.2.0             he02047a_1    conda-forge
libopus                   1.3.1                h7f98852_1    conda-forge
libpciaccess              0.18                 hd590300_0    conda-forge
libpng                    1.6.43               h2797004_0    conda-forge
libprotobuf               4.25.3               h08a7969_0    conda-forge
libsodium                 1.0.18               h36c2ea0_1    conda-forge
libsqlite                 3.46.0               hde9e2c9_0    conda-forge
libssh2                   1.11.0               h0841786_0    conda-forge
libstdcxx-ng              14.1.0               hc0a3c3a_0    conda-forge
libtasn1                  4.19.0               h166bdaf_0    conda-forge
libtiff                   4.6.0                h1dd3fc0_3    conda-forge
libtorch                  2.3.1           cpu_mkl_h0bb0d08_100    conda-forge
libunistring              0.9.10               h7f98852_0    conda-forge
libuuid                   2.38.1               h0b41bf4_0    conda-forge
libuv                     1.48.0               h5eee18b_0  
libva                     2.22.0               hb711507_0    conda-forge
libvpx                    1.14.1               hac33072_0    conda-forge
libwebp-base              1.4.0                hd590300_0    conda-forge
libxcb                    1.16                 hd590300_0    conda-forge
libxcrypt                 4.4.36               hd590300_1    conda-forge
libxml2                   2.12.7               he7c6b58_4    conda-forge
libzlib                   1.3.1                h4ab18f5_1    conda-forge
llvm-openmp               18.1.8               hf5423f3_0    conda-forge
llvmlite                  0.43.0                   pypi_0    pypi
lz4-c                     1.9.4                hcb278e6_0    conda-forge
lzo                       2.10              hd590300_1001    conda-forge
markupsafe                2.1.5           py310h2372a71_0    conda-forge
matplotlib                3.8.4                    pypi_0    pypi
matplotlib-inline         0.1.7              pyhd8ed1ab_0    conda-forge
mistune                   3.0.2              pyhd8ed1ab_0    conda-forge
mkl                       2023.2.0         h84fe81f_50496    conda-forge
mpc                       1.1.0                h10f8cd9_1  
mpfr                      4.0.2                hb69a4c5_1  
mpmath                    1.3.0           py310h06a4308_0  
nb_conda                  2.2.1                    unix_7    conda-forge
nb_conda_kernels          2.5.1              pyh707e725_2    conda-forge
nbclassic                 1.1.0              pyhd8ed1ab_0    conda-forge
nbclient                  0.10.0             pyhd8ed1ab_0    conda-forge
nbconvert                 7.16.4               hd8ed1ab_1    conda-forge
nbconvert-core            7.16.4             pyhd8ed1ab_1    conda-forge
nbconvert-pandoc          7.16.4               hd8ed1ab_1    conda-forge
nbformat                  5.10.4             pyhd8ed1ab_0    conda-forge
ncurses                   6.5                  h59595ed_0    conda-forge
nest-asyncio              1.6.0              pyhd8ed1ab_0    conda-forge
nettle                    3.9.1                h7ab15ed_0    conda-forge
networkx                  3.3             py310h06a4308_0  
nomkl                     3.0                           0  
notebook                  6.5.7              pyha770c72_0    conda-forge
notebook-shim             0.2.4              pyhd8ed1ab_0    conda-forge
nsight-compute            2023.1.0.15                   0    nvidia/label/cuda-12.1.0
numba                     0.60.0                   pypi_0    pypi
numexpr                   2.10.0          py310h3ea09b0_100    conda-forge
numpy                     1.26.4          py310hb13e2d6_0    conda-forge
nvidia-cublas-cu12        12.5.3.2                 pypi_0    pypi
nvidia-cudnn-cu12         9.2.1.18                 pypi_0    pypi
ocl-icd                   2.3.2                hd590300_1    conda-forge
opencv-python             4.10.0.84                pypi_0    pypi
opencv-python-headless    4.10.0.84                pypi_0    pypi
openh264                  2.4.1                h59595ed_0    conda-forge
openjpeg                  2.5.2                h488ebb8_0    conda-forge
openssl                   3.3.1                h4bc722e_2    conda-forge
overrides                 7.7.0              pyhd8ed1ab_0    conda-forge
p11-kit                   0.24.1               hc5aa10d_0    conda-forge
packaging                 24.1               pyhd8ed1ab_0    conda-forge
pandas                    2.2.2                    pypi_0    pypi
pandoc                    3.2.1                ha770c72_0    conda-forge
pandocfilters             1.5.0              pyhd8ed1ab_0    conda-forge
parso                     0.8.4              pyhd8ed1ab_0    conda-forge
patsy                     0.5.6                    pypi_0    pypi
pcre2                     10.44                h0f59acf_0    conda-forge
pexpect                   4.9.0              pyhd8ed1ab_0    conda-forge
pickleshare               0.7.5                   py_1003    conda-forge
pillow                    10.4.0          py310hebfe307_0    conda-forge
pip                       24.0               pyhd8ed1ab_0    conda-forge
pixman                    0.43.2               h59595ed_0    conda-forge
pkgutil-resolve-name      1.3.10             pyhd8ed1ab_1    conda-forge
platformdirs              4.2.2              pyhd8ed1ab_0    conda-forge
prometheus_client         0.20.0             pyhd8ed1ab_0    conda-forge
prompt-toolkit            3.0.47             pyha770c72_0    conda-forge
prompt_toolkit            3.0.47               hd8ed1ab_0    conda-forge
protobuf                  5.27.2                   pypi_0    pypi
psutil                    6.0.0           py310hc51659f_0    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
ptyprocess                0.7.0              pyhd3deb0d_0    conda-forge
pugixml                   1.14                 h59595ed_0    conda-forge
pure_eval                 0.2.3              pyhd8ed1ab_0    conda-forge
py-cpuinfo                9.0.0              pyhd8ed1ab_0    conda-forge
pycocotools               2.0.8                    pypi_0    pypi
pycparser                 2.22               pyhd8ed1ab_0    conda-forge
pygments                  2.18.0             pyhd8ed1ab_0    conda-forge
pyparsing                 3.1.2                    pypi_0    pypi
pysocks                   1.7.1              pyha2e5f31_6    conda-forge
pytables                  3.8.0           py310h374b01c_4    conda-forge
python                    3.10.14         hd12c33a_0_cpython    conda-forge
python-dateutil           2.9.0              pyhd8ed1ab_0    conda-forge
python-fastjsonschema     2.20.0             pyhd8ed1ab_0    conda-forge
python-json-logger        2.0.7              pyhd8ed1ab_0    conda-forge
python_abi                3.10                    4_cp310    conda-forge
pytorch                   2.3.1           cpu_mkl_py310h75865b9_100    conda-forge
pytorch-cuda              12.1                 ha16c6d3_5    pytorch
pytorch-mutex             1.0                        cuda    pytorch
pytz                      2024.1                   pypi_0    pypi
pyyaml                    6.0.1           py310h2372a71_1    conda-forge
pyzmq                     26.0.3          py310h6883aea_0    conda-forge
qtconsole-base            5.5.2              pyha770c72_0    conda-forge
qtpy                      2.4.1              pyhd8ed1ab_0    conda-forge
readline                  8.2                  h8228510_1    conda-forge
referencing               0.35.1             pyhd8ed1ab_0    conda-forge
requests                  2.32.3             pyhd8ed1ab_0    conda-forge
rfc3339-validator         0.1.4              pyhd8ed1ab_0    conda-forge
rfc3986-validator         0.1.1              pyh9f0ad1d_0    conda-forge
rpds-py                   0.19.1          py310h42e942d_0    conda-forge
ruamel-yaml               0.18.6                   pypi_0    pypi
ruamel-yaml-clib          0.2.8                    pypi_0    pypi
safetensors               0.4.3                    pypi_0    pypi
scikit-image              0.24.0                   pypi_0    pypi
scikit-learn              1.5.1                    pypi_0    pypi
scipy                     1.10.1                   pypi_0    pypi
send2trash                1.8.3              pyh0d859eb_0    conda-forge
sentry-sdk                2.11.0                   pypi_0    pypi
setproctitle              1.3.3                    pypi_0    pypi
setuptools                71.0.4             pyhd8ed1ab_0    conda-forge
shapely                   2.0.5                    pypi_0    pypi
six                       1.16.0             pyh6c4a22f_0    conda-forge
sleef                     3.6.1                h3400bea_1    conda-forge
smmap                     5.0.1                    pypi_0    pypi
snappy                    1.2.1                ha2e4443_0    conda-forge
sniffio                   1.3.1              pyhd8ed1ab_0    conda-forge
soupsieve                 2.5                pyhd8ed1ab_1    conda-forge
stack_data                0.6.2              pyhd8ed1ab_0    conda-forge
statsmodels               0.14.2                   pypi_0    pypi
svt-av1                   2.1.2                hac33072_0    conda-forge
sympy                     1.13.1                   pypi_0    pypi
tbb                       2021.12.0            h434a139_3    conda-forge
terminado                 0.18.1             pyh0d859eb_0    conda-forge
threadpoolctl             3.5.0                    pypi_0    pypi
tifffile                  2024.7.24                pypi_0    pypi
timm                      1.0.7                    pypi_0    pypi
tinycss2                  1.3.0              pyhd8ed1ab_0    conda-forge
tk                        8.6.13          noxft_h4845f30_101    conda-forge
torch                     2.4.0                    pypi_0    pypi
torchaudio                2.3.1               py310_cu121    pytorch
torchvision               0.19.0                   pypi_0    pypi
tornado                   6.4.1           py310hc51659f_0    conda-forge
tqdm                      4.66.4                   pypi_0    pypi
traitlets                 5.14.3             pyhd8ed1ab_0    conda-forge
triton                    3.0.0                    pypi_0    pypi
types-python-dateutil     2.9.0.20240316     pyhd8ed1ab_0    conda-forge
typing-extensions         4.12.2               hd8ed1ab_0    conda-forge
typing_extensions         4.12.2             pyha770c72_0    conda-forge
typing_utils              0.1.0              pyhd8ed1ab_0    conda-forge
tzdata                    2024.1                   pypi_0    pypi
uri-template              1.3.0              pyhd8ed1ab_0    conda-forge
urllib3                   2.2.2              pyhd8ed1ab_1    conda-forge
wandb                     0.17.5                   pypi_0    pypi
wayland                   1.23.0               h5291e77_0    conda-forge
wayland-protocols         1.36                 hd8ed1ab_0    conda-forge
wcwidth                   0.2.13             pyhd8ed1ab_0    conda-forge
webcolors                 24.6.0             pyhd8ed1ab_0    conda-forge
webencodings              0.5.1              pyhd8ed1ab_2    conda-forge
websocket-client          1.8.0              pyhd8ed1ab_0    conda-forge
wheel                     0.43.0             pyhd8ed1ab_1    conda-forge
widgetsnbextension        4.0.11             pyhd8ed1ab_0    conda-forge
x264                      1!164.3095           h166bdaf_2    conda-forge
x265                      3.5                  h924138e_3    conda-forge
xorg-fixesproto           5.0               h7f98852_1002    conda-forge
xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
xorg-libice               1.1.1                hd590300_0    conda-forge
xorg-libsm                1.2.4                h7391055_0    conda-forge
xorg-libx11               1.8.9                hb711507_1    conda-forge
xorg-libxau               1.0.11               hd590300_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xorg-libxext              1.3.4                h0b41bf4_2    conda-forge
xorg-libxfixes            5.0.3             h7f98852_1004    conda-forge
xorg-libxrender           0.9.11               hd590300_0    conda-forge
xorg-renderproto          0.11.1            h7f98852_1002    conda-forge
xorg-xextproto            7.3.0             h0b41bf4_1003    conda-forge
xorg-xproto               7.0.31            h7f98852_1007    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
yaml                      0.2.5                h7f98852_2    conda-forge
zeromq                    4.3.5                h75354e8_4    conda-forge
zipp                      3.19.2             pyhd8ed1ab_0    conda-forge
zlib                      1.3.1                h4ab18f5_1    conda-forge
zlib-ng                   2.0.7                h0b41bf4_0    conda-forge
zstandard                 0.23.0          py310h64cae3c_0    conda-forge
zstd                      1.5.6                ha6fb4c9_0    conda-forge

Thank you in advance!!!

You’ve installed the CPU-only binary from conda-forge:

pytorch 2.3.1 cpu_mkl_py310h75865b9_100 conda-forge

Install a PyTorch binary with CUDA support and it should work.
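For anyone wanting to spot this in a long `conda list` dump, here is a minimal sketch that scans the text for torch-related rows whose build string looks CPU-only. The function name `find_cpu_only_builds` and the "build string contains `cpu`" rule are assumptions made for illustration (a heuristic based on the build string quoted above), not an official conda or PyTorch check.

```python
def find_cpu_only_builds(conda_list_text, names=("pytorch", "libtorch", "torch")):
    """Return (name, version, build, channel) rows whose build string looks CPU-only."""
    hits = []
    for line in conda_list_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the "# Name  Version ..." header
        fields = line.split()
        if len(fields) < 3:
            continue  # not a package row
        name, version, build = fields[:3]
        channel = fields[3] if len(fields) > 3 else ""
        # Heuristic (assumption): CPU-only builds carry "cpu" in the build string.
        if name in names and "cpu" in build.lower():
            hits.append((name, version, build, channel))
    return hits


# Two rows copied from the listings in this thread.
sample = """\
# Name                    Version                   Build  Channel
pytorch                   2.3.1           cpu_mkl_py310h75865b9_100    conda-forge
pytorch-cuda              12.1                 ha16c6d3_5    pytorch
"""
print(find_cpu_only_builds(sample))
```

In practice the text could come from saving the output of `conda list` to a file and reading it back. An empty result only means that no row matched the heuristic; it does not prove that the installed build has CUDA support.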

Hi @ptrblck, thank you for your quick response. I uninstalled the CPU version and I am left with:

(DEEPLABCUT) [fmmachta@cn14 bin]$ conda list
# packages in environment at /data03/home/fmmachta/miniconda3/envs/DEEPLABCUT:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                  2_kmp_llvm    conda-forge
albumentations            1.4.3                    pypi_0    pypi
anyio                     4.4.0              pyhd8ed1ab_0    conda-forge
aom                       3.9.1                hac33072_0    conda-forge
argon2-cffi               23.1.0             pyhd8ed1ab_0    conda-forge
argon2-cffi-bindings      21.2.0          py310h2372a71_4    conda-forge
arrow                     1.3.0              pyhd8ed1ab_0    conda-forge
asttokens                 2.4.1              pyhd8ed1ab_0    conda-forge
attrs                     23.2.0             pyh71513ae_0    conda-forge
beautifulsoup4            4.12.3             pyha770c72_0    conda-forge
blas                      1.0                    openblas  
bleach                    6.1.0              pyhd8ed1ab_0    conda-forge
blosc                     1.21.6               hef167b5_0    conda-forge
brotli-python             1.1.0           py310hc6cd4ac_1    conda-forge
bzip2                     1.0.8                h4bc722e_7    conda-forge
c-ares                    1.32.3               h4bc722e_0    conda-forge
c-blosc2                  2.12.0               hb4ffafa_0    conda-forge
ca-certificates           2024.7.4             hbcca054_0    conda-forge
cached-property           1.5.2                hd8ed1ab_1    conda-forge
cached_property           1.5.2              pyha770c72_1    conda-forge
cairo                     1.18.0               hebfffa5_3    conda-forge
cccl                      2.3.2                h2c7f797_0  
certifi                   2024.7.4        py310h06a4308_0  
cffi                      1.16.0          py310h2fee648_0    conda-forge
charset-normalizer        3.3.2              pyhd8ed1ab_0    conda-forge
click                     8.1.7                    pypi_0    pypi
comm                      0.2.2              pyhd8ed1ab_0    conda-forge
contourpy                 1.2.1                    pypi_0    pypi
cuda-cccl                 12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-command-line-tools   12.1.0                        0    nvidia/label/cuda-12.1.0
cuda-compiler             12.1.0                        0    nvidia/label/cuda-12.1.0
cuda-cudart               12.1.105                      0    nvidia
cuda-cudart-dev           12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-cudart-static        12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-cuobjdump            12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-cupti                12.1.105                      0    nvidia
cuda-cupti-static         12.1.62                       0    nvidia/label/cuda-12.1.0
cuda-cuxxfilt             12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-documentation        12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-driver-dev           12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-gdb                  12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-libraries            12.1.0                        0    nvidia
cuda-libraries-dev        12.1.0                        0    nvidia/label/cuda-12.1.0
cuda-libraries-static     12.1.0                        0    nvidia/label/cuda-12.1.0
cuda-nsight               12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-nsight-compute       12.1.0                        0    nvidia/label/cuda-12.1.0
cuda-nvcc                 12.1.66                       0    nvidia/label/cuda-12.1.0
cuda-nvdisasm             12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-nvml-dev             12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-nvprof               12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-nvprune              12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-nvrtc                12.1.105                      0    nvidia
cuda-nvrtc-dev            12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-nvrtc-static         12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-nvtx                 12.1.105                      0    nvidia
cuda-nvvp                 12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-opencl               12.5.39                       0    nvidia
cuda-opencl-dev           12.1.56                       0    nvidia/label/cuda-12.1.0
cuda-profiler-api         12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-runtime              12.1.0                        0    nvidia
cuda-sanitizer-api        12.1.55                       0    nvidia/label/cuda-12.1.0
cuda-toolkit              12.1.0                        0    nvidia/label/cuda-12.1.0
cuda-tools                12.1.0                        0    nvidia/label/cuda-12.1.0
cuda-version              12.5                          3    nvidia
cuda-visual-tools         12.1.0                        0    nvidia/label/cuda-12.1.0
cycler                    0.12.1                   pypi_0    pypi
dav1d                     1.2.1                hd590300_0    conda-forge
debugpy                   1.8.2           py310h76e45a6_0    conda-forge
decorator                 5.1.1              pyhd8ed1ab_0    conda-forge
deeplabcut                3.0.0rc3                 pypi_0    pypi
defusedxml                0.7.1              pyhd8ed1ab_0    conda-forge
dlclibrary                0.0.6                    pypi_0    pypi
docker-pycreds            0.4.0                    pypi_0    pypi
einops                    0.8.0                    pypi_0    pypi
entrypoints               0.4                pyhd8ed1ab_0    conda-forge
exceptiongroup            1.2.2              pyhd8ed1ab_0    conda-forge
executing                 2.0.1              pyhd8ed1ab_0    conda-forge
expat                     2.6.2                h59595ed_0    conda-forge
ffmpeg                    7.0.1           gpl_h9be9148_104    conda-forge
filelock                  3.15.4                   pypi_0    pypi
filterpy                  1.4.5                    pypi_0    pypi
font-ttf-dejavu-sans-mono 2.37                 hab24e00_0    conda-forge
font-ttf-inconsolata      3.000                h77eed37_0    conda-forge
font-ttf-source-code-pro  2.038                h77eed37_0    conda-forge
font-ttf-ubuntu           0.83                 h77eed37_2    conda-forge
fontconfig                2.14.2               h14ed4e7_0    conda-forge
fonts-conda-ecosystem     1                             0    conda-forge
fonts-conda-forge         1                             0    conda-forge
fonttools                 4.53.1                   pypi_0    pypi
fqdn                      1.5.1              pyhd8ed1ab_0    conda-forge
freetype                  2.12.1               h267a509_2    conda-forge
fribidi                   1.0.10               h36c2ea0_0    conda-forge
fsspec                    2024.6.1                 pypi_0    pypi
gds-tools                 1.6.0.25                      0    nvidia/label/cuda-12.1.0
gettext                   0.22.5               h59595ed_2    conda-forge
gettext-tools             0.22.5               h59595ed_2    conda-forge
gitdb                     4.0.11                   pypi_0    pypi
gitpython                 3.1.43                   pypi_0    pypi
gmp                       6.3.0                hac33072_2    conda-forge
gnutls                    3.7.9                hb077bed_0    conda-forge
graphite2                 1.3.13            h59595ed_1003    conda-forge
h2                        4.1.0              pyhd8ed1ab_0    conda-forge
harfbuzz                  9.0.0                hda332d3_1    conda-forge
hdf5                      1.14.3          nompi_hdf9ad27_105    conda-forge
hpack                     4.0.0              pyh9f0ad1d_0    conda-forge
huggingface-hub           0.24.2                   pypi_0    pypi
hyperframe                6.0.1              pyhd8ed1ab_0    conda-forge
icu                       75.1                 he02047a_0    conda-forge
idna                      3.7                pyhd8ed1ab_0    conda-forge
imageio                   2.34.2                   pypi_0    pypi
imageio-ffmpeg            0.5.1                    pypi_0    pypi
imgaug                    0.4.0                    pypi_0    pypi
importlib_resources       6.4.0              pyhd8ed1ab_0    conda-forge
ipykernel                 6.29.5             pyh3099207_0    conda-forge
ipython                   8.26.0             pyh707e725_0    conda-forge
ipython_genutils          0.2.0              pyhd8ed1ab_1    conda-forge
ipywidgets                8.1.3              pyhd8ed1ab_0    conda-forge
isoduration               20.11.0            pyhd8ed1ab_0    conda-forge
jedi                      0.19.1             pyhd8ed1ab_0    conda-forge
jinja2                    3.1.4              pyhd8ed1ab_0    conda-forge
joblib                    1.4.2                    pypi_0    pypi
jsonpointer               3.0.0           py310hff52083_0    conda-forge
jsonschema                4.23.0             pyhd8ed1ab_0    conda-forge
jsonschema-specifications 2023.12.1          pyhd8ed1ab_0    conda-forge
jsonschema-with-format-nongpl 4.23.0               hd8ed1ab_0    conda-forge
jupyter                   1.0.0             pyhd8ed1ab_10    conda-forge
jupyter_client            7.4.9              pyhd8ed1ab_0    conda-forge
jupyter_console           6.6.3              pyhd8ed1ab_0    conda-forge
jupyter_core              5.7.2           py310hff52083_0    conda-forge
jupyter_events            0.10.0             pyhd8ed1ab_0    conda-forge
jupyter_server            2.14.2             pyhd8ed1ab_0    conda-forge
jupyter_server_terminals  0.5.3              pyhd8ed1ab_0    conda-forge
jupyterlab_pygments       0.3.0              pyhd8ed1ab_1    conda-forge
jupyterlab_widgets        3.0.11             pyhd8ed1ab_0    conda-forge
keyutils                  1.6.1                h166bdaf_0    conda-forge
kiwisolver                1.4.5                    pypi_0    pypi
krb5                      1.21.3               h659f571_0    conda-forge
lame                      3.100             h166bdaf_1003    conda-forge
lazy-loader               0.4                      pypi_0    pypi
ld_impl_linux-64          2.40                 hf3520f5_7    conda-forge
libabseil                 20240116.2      cxx17_he02047a_1    conda-forge
libaec                    1.1.3                h59595ed_0    conda-forge
libasprintf               0.22.5               h661eb56_2    conda-forge
libasprintf-devel         0.22.5               h661eb56_2    conda-forge
libass                    0.17.1               h39113c1_2    conda-forge
libblas                   3.9.0           23_linux64_openblas    conda-forge
libcblas                  3.9.0           23_linux64_openblas    conda-forge
libcublas                 12.1.0.26                     0    nvidia
libcublas-dev             12.1.0.26                     0    nvidia/label/cuda-12.1.0
libcublas-static          12.1.0.26                     0    nvidia/label/cuda-12.1.0
libcufft                  11.0.2.4                      0    nvidia
libcufft-dev              11.0.2.4                      0    nvidia/label/cuda-12.1.0
libcufft-static           11.0.2.4                      0    nvidia/label/cuda-12.1.0
libcufile                 1.10.1.7                      0    nvidia
libcufile-dev             1.6.0.25                      0    nvidia/label/cuda-12.1.0
libcufile-static          1.6.0.25                      0    nvidia/label/cuda-12.1.0
libcurand                 10.3.6.82                     0    nvidia
libcurand-dev             10.3.2.56                     0    nvidia/label/cuda-12.1.0
libcurand-static          10.3.2.56                     0    nvidia/label/cuda-12.1.0
libcurl                   8.9.0                hdb1bdb2_0    conda-forge
libcusolver               11.4.4.55                     0    nvidia
libcusolver-dev           11.4.4.55                     0    nvidia/label/cuda-12.1.0
libcusolver-static        11.4.4.55                     0    nvidia/label/cuda-12.1.0
libcusparse               12.0.2.55                     0    nvidia
libcusparse-dev           12.0.2.55                     0    nvidia/label/cuda-12.1.0
libcusparse-static        12.0.2.55                     0    nvidia/label/cuda-12.1.0
libdrm                    2.4.122              h4ab18f5_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 hd590300_2    conda-forge
libexpat                  2.6.2                h59595ed_0    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 14.1.0               h77fa898_0    conda-forge
libgettextpo              0.22.5               h59595ed_2    conda-forge
libgettextpo-devel        0.22.5               h59595ed_2    conda-forge
libgfortran-ng            14.1.0               h69a702a_0    conda-forge
libgfortran5              14.1.0               hc5f4f2c_0    conda-forge
libglib                   2.80.3               h8a4344b_1    conda-forge
libhwloc                  2.11.1          default_hecaa2ac_1000    conda-forge
libiconv                  1.17                 hd590300_2    conda-forge
libidn2                   2.3.7                hd590300_0    conda-forge
liblapack                 3.9.0           23_linux64_openblas    conda-forge
libnghttp2                1.58.0               h47da74e_1    conda-forge
libnpp                    12.0.2.50                     0    nvidia
libnpp-dev                12.0.2.50                     0    nvidia/label/cuda-12.1.0
libnpp-static             12.0.2.50                     0    nvidia/label/cuda-12.1.0
libnsl                    2.0.1                hd590300_0    conda-forge
libnvjitlink              12.1.105                      0    nvidia
libnvjitlink-dev          12.1.55                       0    nvidia/label/cuda-12.1.0
libnvjpeg                 12.1.1.14                     0    nvidia
libnvjpeg-dev             12.1.0.39                     0    nvidia/label/cuda-12.1.0
libnvjpeg-static          12.1.0.39                     0    nvidia/label/cuda-12.1.0
libnvvm-samples           12.1.55                       0    nvidia/label/cuda-12.1.0
libopenblas               0.3.27          pthreads_hac2b453_1    conda-forge
libopenvino               2024.2.0             h2da1b83_1    conda-forge
libopenvino-auto-batch-plugin 2024.2.0             hb045406_1    conda-forge
libopenvino-auto-plugin   2024.2.0             hb045406_1    conda-forge
libopenvino-hetero-plugin 2024.2.0             h5c03a75_1    conda-forge
libopenvino-intel-cpu-plugin 2024.2.0             h2da1b83_1    conda-forge
libopenvino-intel-gpu-plugin 2024.2.0             h2da1b83_1    conda-forge
libopenvino-intel-npu-plugin 2024.2.0             he02047a_1    conda-forge
libopenvino-ir-frontend   2024.2.0             h5c03a75_1    conda-forge
libopenvino-onnx-frontend 2024.2.0             h07e8aee_1    conda-forge
libopenvino-paddle-frontend 2024.2.0             h07e8aee_1    conda-forge
libopenvino-pytorch-frontend 2024.2.0             he02047a_1    conda-forge
libopenvino-tensorflow-frontend 2024.2.0             h39126c6_1    conda-forge
libopenvino-tensorflow-lite-frontend 2024.2.0             he02047a_1    conda-forge
libopus                   1.3.1                h7f98852_1    conda-forge
libpciaccess              0.18                 hd590300_0    conda-forge
libpng                    1.6.43               h2797004_0    conda-forge
libprotobuf               4.25.3               h08a7969_0    conda-forge
libsodium                 1.0.18               h36c2ea0_1    conda-forge
libsqlite                 3.46.0               hde9e2c9_0    conda-forge
libssh2                   1.11.0               h0841786_0    conda-forge
libstdcxx-ng              14.1.0               hc0a3c3a_0    conda-forge
libtasn1                  4.19.0               h166bdaf_0    conda-forge
libunistring              0.9.10               h7f98852_0    conda-forge
libuuid                   2.38.1               h0b41bf4_0    conda-forge
libva                     2.22.0               hb711507_0    conda-forge
libvpx                    1.14.1               hac33072_0    conda-forge
libxcb                    1.16                 hd590300_0    conda-forge
libxcrypt                 4.4.36               hd590300_1    conda-forge
libxml2                   2.12.7               he7c6b58_4    conda-forge
libzlib                   1.3.1                h4ab18f5_1    conda-forge
llvm-openmp               18.1.8               hf5423f3_0    conda-forge
llvmlite                  0.43.0                   pypi_0    pypi
lz4-c                     1.9.4                hcb278e6_0    conda-forge
lzo                       2.10              hd590300_1001    conda-forge
markupsafe                2.1.5           py310h2372a71_0    conda-forge
matplotlib                3.8.4                    pypi_0    pypi
matplotlib-inline         0.1.7              pyhd8ed1ab_0    conda-forge
mistune                   3.0.2              pyhd8ed1ab_0    conda-forge
nb_conda                  2.2.1                    unix_7    conda-forge
nb_conda_kernels          2.5.1              pyh707e725_2    conda-forge
nbclassic                 1.1.0              pyhd8ed1ab_0    conda-forge
nbclient                  0.10.0             pyhd8ed1ab_0    conda-forge
nbconvert                 7.16.4               hd8ed1ab_1    conda-forge
nbconvert-core            7.16.4             pyhd8ed1ab_1    conda-forge
nbconvert-pandoc          7.16.4               hd8ed1ab_1    conda-forge
nbformat                  5.10.4             pyhd8ed1ab_0    conda-forge
ncurses                   6.5                  h59595ed_0    conda-forge
nest-asyncio              1.6.0              pyhd8ed1ab_0    conda-forge
nettle                    3.9.1                h7ab15ed_0    conda-forge
nomkl                     3.0                           0  
notebook                  6.5.7              pyha770c72_0    conda-forge
notebook-shim             0.2.4              pyhd8ed1ab_0    conda-forge
nsight-compute            2023.1.0.15                   0    nvidia/label/cuda-12.1.0
numba                     0.60.0                   pypi_0    pypi
numexpr                   2.10.0          py310h3ea09b0_100    conda-forge
numpy                     1.26.4          py310hb13e2d6_0    conda-forge
nvidia-cublas-cu12        12.5.3.2                 pypi_0    pypi
nvidia-cudnn-cu12         9.2.1.18                 pypi_0    pypi
ocl-icd                   2.3.2                hd590300_1    conda-forge
opencv-python             4.10.0.84                pypi_0    pypi
opencv-python-headless    4.10.0.84                pypi_0    pypi
openh264                  2.4.1                h59595ed_0    conda-forge
openssl                   3.3.1                h4bc722e_2    conda-forge
overrides                 7.7.0              pyhd8ed1ab_0    conda-forge
p11-kit                   0.24.1               hc5aa10d_0    conda-forge
packaging                 24.1               pyhd8ed1ab_0    conda-forge
pandas                    2.2.2                    pypi_0    pypi
pandoc                    3.2.1                ha770c72_0    conda-forge
pandocfilters             1.5.0              pyhd8ed1ab_0    conda-forge
parso                     0.8.4              pyhd8ed1ab_0    conda-forge
patsy                     0.5.6                    pypi_0    pypi
pcre2                     10.44                h0f59acf_0    conda-forge
pexpect                   4.9.0              pyhd8ed1ab_0    conda-forge
pickleshare               0.7.5                   py_1003    conda-forge
pip                       24.0               pyhd8ed1ab_0    conda-forge
pixman                    0.43.2               h59595ed_0    conda-forge
pkgutil-resolve-name      1.3.10             pyhd8ed1ab_1    conda-forge
platformdirs              4.2.2              pyhd8ed1ab_0    conda-forge
prometheus_client         0.20.0             pyhd8ed1ab_0    conda-forge
prompt-toolkit            3.0.47             pyha770c72_0    conda-forge
prompt_toolkit            3.0.47               hd8ed1ab_0    conda-forge
protobuf                  5.27.2                   pypi_0    pypi
psutil                    6.0.0           py310hc51659f_0    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
ptyprocess                0.7.0              pyhd3deb0d_0    conda-forge
pugixml                   1.14                 h59595ed_0    conda-forge
pure_eval                 0.2.3              pyhd8ed1ab_0    conda-forge
py-cpuinfo                9.0.0              pyhd8ed1ab_0    conda-forge
pycocotools               2.0.8                    pypi_0    pypi
pycparser                 2.22               pyhd8ed1ab_0    conda-forge
pygments                  2.18.0             pyhd8ed1ab_0    conda-forge
pyparsing                 3.1.2                    pypi_0    pypi
pysocks                   1.7.1              pyha2e5f31_6    conda-forge
pytables                  3.8.0           py310h374b01c_4    conda-forge
python                    3.10.14         hd12c33a_0_cpython    conda-forge
python-dateutil           2.9.0              pyhd8ed1ab_0    conda-forge
python-fastjsonschema     2.20.0             pyhd8ed1ab_0    conda-forge
python-json-logger        2.0.7              pyhd8ed1ab_0    conda-forge
python_abi                3.10                    4_cp310    conda-forge
pytorch-cuda              12.1                 ha16c6d3_5    pytorch
pytz                      2024.1                   pypi_0    pypi
pyyaml                    6.0.1           py310h2372a71_1    conda-forge
pyzmq                     26.0.3          py310h6883aea_0    conda-forge
qtconsole-base            5.5.2              pyha770c72_0    conda-forge
qtpy                      2.4.1              pyhd8ed1ab_0    conda-forge
readline                  8.2                  h8228510_1    conda-forge
referencing               0.35.1             pyhd8ed1ab_0    conda-forge
requests                  2.32.3             pyhd8ed1ab_0    conda-forge
rfc3339-validator         0.1.4              pyhd8ed1ab_0    conda-forge
rfc3986-validator         0.1.1              pyh9f0ad1d_0    conda-forge
rpds-py                   0.19.1          py310h42e942d_0    conda-forge
ruamel-yaml               0.18.6                   pypi_0    pypi
ruamel-yaml-clib          0.2.8                    pypi_0    pypi
safetensors               0.4.3                    pypi_0    pypi
scikit-image              0.24.0                   pypi_0    pypi
scikit-learn              1.5.1                    pypi_0    pypi
scipy                     1.10.1                   pypi_0    pypi
send2trash                1.8.3              pyh0d859eb_0    conda-forge
sentry-sdk                2.11.0                   pypi_0    pypi
setproctitle              1.3.3                    pypi_0    pypi
setuptools                71.0.4             pyhd8ed1ab_0    conda-forge
shapely                   2.0.5                    pypi_0    pypi
six                       1.16.0             pyh6c4a22f_0    conda-forge
smmap                     5.0.1                    pypi_0    pypi
snappy                    1.2.1                ha2e4443_0    conda-forge
sniffio                   1.3.1              pyhd8ed1ab_0    conda-forge
soupsieve                 2.5                pyhd8ed1ab_1    conda-forge
stack_data                0.6.2              pyhd8ed1ab_0    conda-forge
statsmodels               0.14.2                   pypi_0    pypi
svt-av1                   2.1.2                hac33072_0    conda-forge
sympy                     1.13.1                   pypi_0    pypi
tbb                       2021.12.0            h434a139_3    conda-forge
terminado                 0.18.1             pyh0d859eb_0    conda-forge
threadpoolctl             3.5.0                    pypi_0    pypi
tifffile                  2024.7.24                pypi_0    pypi
timm                      1.0.7                    pypi_0    pypi
tinycss2                  1.3.0              pyhd8ed1ab_0    conda-forge
tk                        8.6.13          noxft_h4845f30_101    conda-forge
torch                     2.4.0                    pypi_0    pypi
torchvision               0.19.0                   pypi_0    pypi
tornado                   6.4.1           py310hc51659f_0    conda-forge
tqdm                      4.66.4                   pypi_0    pypi
traitlets                 5.14.3             pyhd8ed1ab_0    conda-forge
triton                    3.0.0                    pypi_0    pypi
types-python-dateutil     2.9.0.20240316     pyhd8ed1ab_0    conda-forge
typing-extensions         4.12.2               hd8ed1ab_0    conda-forge
typing_extensions         4.12.2             pyha770c72_0    conda-forge
typing_utils              0.1.0              pyhd8ed1ab_0    conda-forge
tzdata                    2024.1                   pypi_0    pypi
uri-template              1.3.0              pyhd8ed1ab_0    conda-forge
urllib3                   2.2.2              pyhd8ed1ab_1    conda-forge
wandb                     0.17.5                   pypi_0    pypi
wayland                   1.23.0               h5291e77_0    conda-forge
wayland-protocols         1.36                 hd8ed1ab_0    conda-forge
wcwidth                   0.2.13             pyhd8ed1ab_0    conda-forge
webcolors                 24.6.0             pyhd8ed1ab_0    conda-forge
webencodings              0.5.1              pyhd8ed1ab_2    conda-forge
websocket-client          1.8.0              pyhd8ed1ab_0    conda-forge
wheel                     0.43.0             pyhd8ed1ab_1    conda-forge
widgetsnbextension        4.0.11             pyhd8ed1ab_0    conda-forge
x264                      1!164.3095           h166bdaf_2    conda-forge
x265                      3.5                  h924138e_3    conda-forge
xorg-fixesproto           5.0               h7f98852_1002    conda-forge
xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
xorg-libice               1.1.1                hd590300_0    conda-forge
xorg-libsm                1.2.4                h7391055_0    conda-forge
xorg-libx11               1.8.9                hb711507_1    conda-forge
xorg-libxau               1.0.11               hd590300_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xorg-libxext              1.3.4                h0b41bf4_2    conda-forge
xorg-libxfixes            5.0.3             h7f98852_1004    conda-forge
xorg-libxrender           0.9.11               hd590300_0    conda-forge
xorg-renderproto          0.11.1            h7f98852_1002    conda-forge
xorg-xextproto            7.3.0             h0b41bf4_1003    conda-forge
xorg-xproto               7.0.31            h7f98852_1007    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
yaml                      0.2.5                h7f98852_2    conda-forge
zeromq                    4.3.5                h75354e8_4    conda-forge
zipp                      3.19.2             pyhd8ed1ab_0    conda-forge
zlib                      1.3.1                h4ab18f5_1    conda-forge
zlib-ng                   2.0.7                h0b41bf4_0    conda-forge
zstandard                 0.23.0          py310h64cae3c_0    conda-forge
zstd                      1.5.6                ha6fb4c9_0    conda-forge

I notice that pytorch-cuda is still installed. Is that the PyTorch binary with CUDA support you are referring to?

I then tried again:

(DEEPLABCUT) [fmmachta@cn14 bin]$ 
(DEEPLABCUT) [fmmachta@cn14 bin]$ ipython
Python 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:45:18) [GCC 12.3.0]

In [1]: import torch
In [2]: torch.cuda.is_available()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[2], line 1
----> 1 torch.cuda.is_available()

AttributeError: module 'torch' has no attribute 'cuda'

In [3]: exit
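An `AttributeError` like the one above, where `import torch` succeeds but `torch.cuda` is missing, can happen when Python picks up a leftover or partial `torch` directory instead of a complete install. That is only one possible cause and is not confirmed in this thread. The sketch below uses the standard-library `importlib` to show where a module would be loaded from without importing it. The helper name `describe_module` is made up for illustration.

```python
import importlib.metadata
import importlib.util


def describe_module(name):
    """Collect where `name` would be imported from, without importing it."""
    spec = importlib.util.find_spec(name)
    info = {
        "found": spec is not None,
        "origin": None,
        "locations": [],
        "dist_version": None,
    }
    if spec is not None:
        # origin is None for a namespace package, i.e. a bare directory
        # with no __init__.py (for example, leftovers of an uninstall).
        info["origin"] = spec.origin
        info["locations"] = list(spec.submodule_search_locations or [])
    try:
        # Version recorded in the installed distribution metadata, if any.
        info["dist_version"] = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        pass
    return info


print(describe_module("torch"))
```

If `found` is true but `origin` is `None`, the directories listed under `locations` are worth inspecting and possibly cleaning up before reinstalling.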

I will try to install it again. I am using the command from the front page of pytorch.org:

(DEEPLABCUT) fmmachta@commander /apps/local/cuda/bin $ conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
Channels:
 - pytorch
 - nvidia
 - defaults
 - conda-forge
 - nvidia/label/cuda-12.1.0
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /data03/home/fmmachta/miniconda3/envs/DEEPLABCUT

  added / updated specs:
    - pytorch
    - pytorch-cuda=12.1
    - torchaudio
    - torchvision


The following NEW packages will be INSTALLED:

  filelock           pkgs/main/linux-64::filelock-3.13.1-py310h06a4308_0 
  fsspec             pkgs/main/linux-64::fsspec-2024.3.1-py310h06a4308_0 
  gmpy2              pkgs/main/linux-64::gmpy2-2.1.2-py310heeb90bb_0 
  lcms2              conda-forge/linux-64::lcms2-2.16-hb7c19ff_0 
  lerc               conda-forge/linux-64::lerc-4.0.0-h27087fc_0 
  libdeflate         conda-forge/linux-64::libdeflate-1.20-hd590300_0 
  libjpeg-turbo      pkgs/main/linux-64::libjpeg-turbo-3.0.3-h5eee18b_0 
  libtiff            conda-forge/linux-64::libtiff-4.6.0-h1dd3fc0_3 
  libtorch           conda-forge/linux-64::libtorch-2.3.1-cpu_mkl_h0bb0d08_100 
  libuv              pkgs/main/linux-64::libuv-1.48.0-h5eee18b_0 
  libwebp-base       conda-forge/linux-64::libwebp-base-1.4.0-hd590300_0 
  mkl                conda-forge/linux-64::mkl-2023.2.0-h84fe81f_50496 
  mpc                pkgs/main/linux-64::mpc-1.1.0-h10f8cd9_1 
  mpfr               pkgs/main/linux-64::mpfr-4.0.2-hb69a4c5_1 
  mpmath             pkgs/main/linux-64::mpmath-1.3.0-py310h06a4308_0 
  networkx           pkgs/main/linux-64::networkx-3.3-py310h06a4308_0 
  openjpeg           conda-forge/linux-64::openjpeg-2.5.2-h488ebb8_0 
  pillow             conda-forge/linux-64::pillow-10.4.0-py310hebfe307_0 
  pytorch            conda-forge/linux-64::pytorch-2.3.1-cpu_mkl_py310h75865b9_100 
  pytorch-mutex      pytorch/noarch::pytorch-mutex-1.0-cuda 
  sleef              conda-forge/linux-64::sleef-3.6.1-h3400bea_1 
  sympy              pkgs/main/linux-64::sympy-1.12-py310h06a4308_0 
  torchaudio         pytorch/linux-64::torchaudio-2.3.1-py310_cu121 
  torchvision        pytorch/linux-64::torchvision-0.18.1-py310_cu121 


Proceed ([y]/n)? 


This is what I did before, and it does not seem to work because it installs the CPU version, so I answered n.

How else should I install the GPU binary version? THANKS AGAIN!!

Could you remove the conda-forge channel from your current env and depend on the main and pytorch channels only?
If not, try to install the pip wheels instead.
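Whichever route is taken, a quick check afterwards can tell a CPU-only build from a CUDA build. The following is a minimal sketch, not an official diagnostic: the helper name `cuda_report` is hypothetical, and it relies only on `torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()` and `torch.cuda.get_device_name()`. It uses `getattr` so that it also copes with a broken install where `torch.cuda` is missing.

```python
def cuda_report():
    """Summarise whether the importable torch build can use a GPU."""
    try:
        import torch
    except ImportError:
        return {"installed": False}

    version = getattr(torch, "version", None)
    cuda = getattr(torch, "cuda", None)
    report = {
        "installed": True,
        "torch": getattr(torch, "__version__", None),
        # None here means the build was compiled without CUDA (CPU-only).
        "built_with_cuda": getattr(version, "cuda", None),
        # False if torch.cuda is missing or no usable GPU/driver is visible.
        "cuda_available": bool(cuda and cuda.is_available()),
    }
    if report["cuda_available"]:
        report["device"] = cuda.get_device_name(0)
    return report


print(cuda_report())
```

This should be run inside the GPU job (for example the interactive `srun` session), since a login node without a GPU would report `cuda_available` as false even with a correct install.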

Hello @ptrblck, thank you again for the response. I tried hard to convert the conda-forge packages to pip dependencies, but it didn't work out. I ended up doing a fresh install of DeepLabCut into a new conda environment, and I noticed that most of the dependencies are installed via pip. Here is the .yaml file I installed from:



# DEEPLABCUT.yaml

#DeepLabCut Toolbox (deeplabcut.org)
#© A. & M.W. Mathis Labs
#https://github.com/DeepLabCut/DeepLabCut
#Please see AUTHORS for contributors.

#https://github.com/DeepLabCut/DeepLabCut/blob/main/AUTHORS
#Licensed under GNU Lesser General Public License v3.0
#
# DeepLabCut environment
# FIRST: INSTALL CORRECT DRIVER for GPU, see https://stackoverflow.com/questions/30820513/what-is-the-correct-version-of-cuda-for-my-nvidia-driver/30820690
#
# AFTER THIS FILE IS INSTALLED, if you have a GPU be sure to install cudnn from conda-forge: conda install cudnn -c conda-forge
#
# install: conda env create -f DEEPLABCUT.yaml
# update:  conda env update -f DEEPLABCUT.yaml
name: DEEPLABCUT
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.10
  - pip
  - ipython
  - jupyter
  - nb_conda
  - notebook<7.0.0
  - ffmpeg
  - pytables==3.8.0
  - pip:
    - "git+https://github.com/DeepLabCut/DeepLabCut.git@pytorch_dlc#egg=deeplabcut[modelzoo,wandb]"

Here are my logs for the install:

(base) fmmachta@commander ~ $ conda env create -f DEEPLABCUT.yaml
Channels:
 - conda-forge
 - defaults
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: done

Downloading and Extracting Packages:

Preparing transaction: done
Verifying transaction: done
Executing transaction: \ Enabling nb_conda_kernels...
CONDA_PREFIX: /data03/home/fmmachta/miniconda3/envs/DEEPLABCUT
Status: enabled

- Enabling notebook extension nb_conda/main...
      - Validating: OK
Enabling tree extension nb_conda/tree...
      - Validating: OK
Enabling: nb_conda
- Writing config: /data03/home/fmmachta/miniconda3/envs/DEEPLABCUT/etc/jupyter
    - Validating...
      nb_conda 2.2.1 OK

done
Installing pip dependencies: \ Ran pip subprocess with arguments:
['/data03/home/fmmachta/miniconda3/envs/DEEPLABCUT/bin/python', '-m', 'pip', 'install', '-U', '-r', '/home/fmmachta/condaenv.m8z9q4u0.requirements.txt', '--exists-action=b']
Pip subprocess output:

Successfully built deeplabcut
Installing collected packages: pytz, mpmath, tzdata, tqdm, tifffile, threadpoolctl, sympy, smmap, Shapely, setproctitle, sentry-sdk, scipy, safetensors, ruamel.yaml.clib, pyparsing, protobuf, Pillow, patsy, opencv-python-headless, opencv-python, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, networkx, llvmlite, lazy-loader, kiwisolver, joblib, imageio-ffmpeg, fsspec, fonttools, filelock, einops, docker-pycreds, cycler, contourpy, click, triton, scikit-learn, ruamel.yaml, pandas, nvidia-cusparse-cu12, nvidia-cudnn-cu12, numba, matplotlib, imageio, huggingface-hub, gitdb, statsmodels, scikit-image, pycocotools, nvidia-cusolver-cu12, gitpython, filterpy, dlclibrary, wandb, torch, imgaug, albumentations, torchvision, timm, deeplabcut
Successfully installed Pillow-10.4.0 Shapely-2.0.5 albumentations-1.4.3 click-8.1.7 contourpy-1.2.1 cycler-0.12.1 deeplabcut-3.0.0rc3 dlclibrary-0.0.6 docker-pycreds-0.4.0 einops-0.8.0 filelock-3.15.4 filterpy-1.4.5 fonttools-4.53.1 fsspec-2024.6.1 gitdb-4.0.11 gitpython-3.1.43 huggingface-hub-0.24.2 imageio-2.34.2 imageio-ffmpeg-0.5.1 imgaug-0.4.0 joblib-1.4.2 kiwisolver-1.4.5 lazy-loader-0.4 llvmlite-0.43.0 matplotlib-3.8.4 mpmath-1.3.0 networkx-3.3 numba-0.60.0 nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-9.1.0.70 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.20.5 nvidia-nvjitlink-cu12-12.5.82 nvidia-nvtx-cu12-12.1.105 opencv-python-4.10.0.84 opencv-python-headless-4.10.0.84 pandas-2.2.2 patsy-0.5.6 protobuf-5.27.2 pycocotools-2.0.8 pyparsing-3.1.2 pytz-2024.1 ruamel.yaml-0.18.6 ruamel.yaml.clib-0.2.8 safetensors-0.4.3 scikit-image-0.24.0 scikit-learn-1.5.1 scipy-1.10.1 sentry-sdk-2.11.0 setproctitle-1.3.3 smmap-5.0.1 statsmodels-0.14.2 sympy-1.13.1 threadpoolctl-3.5.0 tifffile-2024.7.24 timm-1.0.7 torch-2.4.0 torchvision-0.19.0 tqdm-4.66.4 triton-3.0.0 tzdata-2024.1 wandb-0.17.5

done
#
# To activate this environment, use
#
#     $ conda activate DEEPLABCUT
#
# To deactivate an active environment, use
#
#     $ conda deactivate

(base) fmmachta@commander ~ $ conda activate DEEPLABCUt

EnvironmentNameNotFound: Could not find conda environment: DEEPLABCUt
You can list all discoverable environments with `conda info --envs`.


(base) fmmachta@commander ~ $ conda activate DEEPLABCUT
(DEEPLABCUT) fmmachta@commander ~ $ pip list
Package                   Version
------------------------- --------------
albumentations            1.4.3
anyio                     4.4.0
argon2-cffi               23.1.0
argon2-cffi-bindings      21.2.0
arrow                     1.3.0
asttokens                 2.4.1
attrs                     23.2.0
beautifulsoup4            4.12.3
bleach                    6.1.0
Brotli                    1.1.0
cached-property           1.5.2
certifi                   2024.7.4
cffi                      1.16.0
charset-normalizer        3.3.2
click                     8.1.7
comm                      0.2.2
contourpy                 1.2.1
cycler                    0.12.1
debugpy                   1.8.2
decorator                 5.1.1
deeplabcut                3.0.0rc3
defusedxml                0.7.1
dlclibrary                0.0.6
docker-pycreds            0.4.0
einops                    0.8.0
entrypoints               0.4
exceptiongroup            1.2.2
executing                 2.0.1
fastjsonschema            2.20.0
filelock                  3.15.4
filterpy                  1.4.5
fonttools                 4.53.1
fqdn                      1.5.1
fsspec                    2024.6.1
gitdb                     4.0.11
GitPython                 3.1.43
h2                        4.1.0
hpack                     4.0.0
huggingface-hub           0.24.2
hyperframe                6.0.1
idna                      3.7
imageio                   2.34.2
imageio-ffmpeg            0.5.1
imgaug                    0.4.0
importlib_resources       6.4.0
ipykernel                 6.29.5
ipython                   8.26.0
ipython_genutils          0.2.0
ipywidgets                8.1.3
isoduration               20.11.0
jedi                      0.19.1
Jinja2                    3.1.4
joblib                    1.4.2
jsonpointer               3.0.0
jsonschema                4.23.0
jsonschema-specifications 2023.12.1
jupyter                   1.0.0
jupyter_client            7.4.9
jupyter-console           6.6.3
jupyter_core              5.7.2
jupyter-events            0.10.0
jupyter_server            2.14.2
jupyter_server_terminals  0.5.3
jupyterlab_pygments       0.3.0
jupyterlab_widgets        3.0.11
kiwisolver                1.4.5
lazy_loader               0.4
llvmlite                  0.43.0
MarkupSafe                2.1.5
matplotlib                3.8.4
matplotlib-inline         0.1.7
mistune                   3.0.2
mpmath                    1.3.0
nb_conda                  2.2.1
nb_conda_kernels          2.5.1
nbclassic                 1.1.0
nbclient                  0.10.0
nbconvert                 7.16.4
nbformat                  5.10.4
nest_asyncio              1.6.0
networkx                  3.3
notebook                  6.5.7
notebook_shim             0.2.4
numba                     0.60.0
numexpr                   2.10.0
numpy                     1.26.4
nvidia-cublas-cu12        12.1.3.1
nvidia-cuda-cupti-cu12    12.1.105
nvidia-cuda-nvrtc-cu12    12.1.105
nvidia-cuda-runtime-cu12  12.1.105
nvidia-cudnn-cu12         9.1.0.70
nvidia-cufft-cu12         11.0.2.54
nvidia-curand-cu12        10.3.2.106
nvidia-cusolver-cu12      11.4.5.107
nvidia-cusparse-cu12      12.1.0.106
nvidia-nccl-cu12          2.20.5
nvidia-nvjitlink-cu12     12.5.82
nvidia-nvtx-cu12          12.1.105
opencv-python             4.10.0.84
opencv-python-headless    4.10.0.84
overrides                 7.7.0
packaging                 24.1
pandas                    2.2.2
pandocfilters             1.5.0
parso                     0.8.4
patsy                     0.5.6
pexpect                   4.9.0
pickleshare               0.7.5
pillow                    10.4.0
pip                       24.0
pkgutil_resolve_name      1.3.10
platformdirs              4.2.2
prometheus_client         0.20.0
prompt_toolkit            3.0.47
protobuf                  5.27.2
psutil                    6.0.0
ptyprocess                0.7.0
pure_eval                 0.2.3
py-cpuinfo                9.0.0
pycocotools               2.0.8
pycparser                 2.22
Pygments                  2.18.0
pyparsing                 3.1.2
PySocks                   1.7.1
python-dateutil           2.9.0
python-json-logger        2.0.7
pytz                      2024.1
PyYAML                    6.0.1
pyzmq                     26.0.3
qtconsole                 5.5.2
QtPy                      2.4.1
referencing               0.35.1
requests                  2.32.3
rfc3339-validator         0.1.4
rfc3986-validator         0.1.1
rpds-py                   0.19.1
ruamel.yaml               0.18.6
ruamel.yaml.clib          0.2.8
safetensors               0.4.3
scikit-image              0.24.0
scikit-learn              1.5.1
scipy                     1.10.1
Send2Trash                1.8.3
sentry-sdk                2.11.0
setproctitle              1.3.3
setuptools                71.0.4
shapely                   2.0.5
six                       1.16.0
smmap                     5.0.1
sniffio                   1.3.1
soupsieve                 2.5
stack-data                0.6.2
statsmodels               0.14.2
sympy                     1.13.1
tables                    3.8.0
terminado                 0.18.1
threadpoolctl             3.5.0
tifffile                  2024.7.24
timm                      1.0.7
tinycss2                  1.3.0
torch                     2.4.0
torchvision               0.19.0
tornado                   6.4.1
tqdm                      4.66.4
traitlets                 5.14.3
triton                    3.0.0
types-python-dateutil     2.9.0.20240316
typing_extensions         4.12.2
typing-utils              0.1.0
tzdata                    2024.1
uri-template              1.3.0
urllib3                   2.2.2
wandb                     0.17.5
wcwidth                   0.2.13
webcolors                 24.6.0
webencodings              0.5.1
websocket-client          1.8.0
wheel                     0.43.0
widgetsnbextension        4.0.11
zipp                      3.19.2
zstandard                 0.23.0

As always, I really do appreciate the help. Forgive me if I misunderstood your request. If there is anything that I am doing wrong, please correct me. Thank you.

Continuing from my previous message, importing torch and checking for GPU devices gives me this:

(DEEPLABCUT) fmmachta@commander ~ $ srun --nodes=1 --ntasks=1 --cpus-per-task=4 --mem=16GB --time=04:00:00 --gres=gpu:1 --nodelist=cn14 --pty bash
'abrt-cli status' timed out
(DEEPLABCUT) [fmmachta@cn14 ~]$ nvidia-smi
Thu Jul 25 23:26:14 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla P100-PCIE-16GB            Off| 00000000:08:00.0 Off |                    0 |
| N/A   27C    P0               27W / 250W|      0MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  Tesla P100-PCIE-16GB            Off| 00000000:84:00.0 Off |                    0 |
| N/A   29C    P0               27W / 250W|      0MiB / 16384MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
(DEEPLABCUT) [fmmachta@cn14 ~]$ which nvidia-smi
/usr/bin/nvidia-smi
(DEEPLABCUT) [fmmachta@cn14 ~]$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Feb__7_19:32:13_PST_2023
Cuda compilation tools, release 12.1, V12.1.66
Build cuda_12.1.r12.1/compiler.32415258_0
(DEEPLABCUT) [fmmachta@cn14 ~]$ which nvcc
/apps/local/cuda/bin/nvcc
(DEEPLABCUT) [fmmachta@cn14 ~]$ ipython
Python 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:45:18) [GCC 12.3.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.26.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import torch

In [2]: torch.cuda.is_available()
/data03/home/fmmachta/miniconda3/envs/DEEPLABCUT/lib/python3.10/site-packages/torch/cuda/__init__.py:128: UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:108.)
  return torch._C._cuda_getDeviceCount() > 0
Out[2]: False

In [3]: torch.cuda.device_count()
Out[3]: 1

In [4]: torch.cuda.current_device()
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[4], line 1
----> 1 torch.cuda.current_device()

File /data03/home/fmmachta/miniconda3/envs/DEEPLABCUT/lib/python3.10/site-packages/torch/cuda/__init__.py:878, in current_device()
    876 def current_device() -> int:
    877     r"""Return the index of a currently selected device."""
--> 878     _lazy_init()
    879     return torch._C._cuda_getDevice()

File /data03/home/fmmachta/miniconda3/envs/DEEPLABCUT/lib/python3.10/site-packages/torch/cuda/__init__.py:314, in _lazy_init()
    312 if "CUDA_MODULE_LOADING" not in os.environ:
    313     os.environ["CUDA_MODULE_LOADING"] = "LAZY"
--> 314 torch._C._cuda_init()
    315 # Some of the queued calls may reentrantly call _lazy_init();
    316 # we need to just return without initializing in that case.
    317 # However, we must not let any *other* threads in!
    318 _tls.is_initializing = True

RuntimeError: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero.

In [5]: torch.cuda.device(0)
Out[5]: <torch.cuda.device at 0x2b9fd809d090>

In [6]: torch.cuda.get_device_name(0)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[6], line 1
----> 1 torch.cuda.get_device_name(0)

File /data03/home/fmmachta/miniconda3/envs/DEEPLABCUT/lib/python3.10/site-packages/torch/cuda/__init__.py:435, in get_device_name(device)
    423 def get_device_name(device: Optional[_device_t] = None) -> str:
    424     r"""Get the name of a device.
    425 
    426     Args:
   (...)
    433         str: the name of the device
    434     """
--> 435     return get_device_properties(device).name

File /data03/home/fmmachta/miniconda3/envs/DEEPLABCUT/lib/python3.10/site-packages/torch/cuda/__init__.py:465, in get_device_properties(device)
    455 def get_device_properties(device: _device_t) -> _CudaDeviceProperties:
    456     r"""Get the properties of a device.
    457 
    458     Args:
   (...)
    463         _CudaDeviceProperties: the properties of the device
    464     """
--> 465     _lazy_init()  # will define _get_device_properties
    466     device = _get_device_index(device, optional=True)
    467     if device < 0 or device >= device_count():

File /data03/home/fmmachta/miniconda3/envs/DEEPLABCUT/lib/python3.10/site-packages/torch/cuda/__init__.py:314, in _lazy_init()
    312 if "CUDA_MODULE_LOADING" not in os.environ:
    313     os.environ["CUDA_MODULE_LOADING"] = "LAZY"
--> 314 torch._C._cuda_init()
    315 # Some of the queued calls may reentrantly call _lazy_init();
    316 # we need to just return without initializing in that case.
    317 # However, we must not let any *other* threads in!
    318 _tls.is_initializing = True

RuntimeError: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero.

In [7]: 

What do torch.__version__ and torch.version.cuda return after installing the new binaries?

These are my torch versions:

(DEEPLABCUT) fmmachta@commander ~ $ srun --nodes=1 --ntasks=1 --cpus-per-task=4 --mem=16GB --time=04:00:00 --gres=gpu:1 --nodelist=cn14 --pty bash
'abrt-cli status' timed out
(base) [fmmachta@cn14 ~]$ conda activate DEEPLABCUT
(DEEPLABCUT) [fmmachta@cn14 ~]$ ipython
Python 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:45:18) [GCC 12.3.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.26.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import torch

In [2]: torch.__version__
Out[2]: '2.4.0+cu121'

In [3]: torch.version.cuda
Out[3]: '12.1'
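
The same checks can be collected in one script so they are easy to paste into a support request. This is a minimal sketch, not an official diagnostic: the function name `torch_report` is hypothetical, and it only uses `torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()` and `torch.cuda.device_count()`. It returns early if torch cannot be imported.

```python
def torch_report():
    """Collect the version checks asked for in this thread into one dict."""
    try:
        import torch
    except ImportError:
        # torch is not installed in the active environment.
        return {"torch_installed": False}

    return {
        "torch_installed": True,
        # e.g. '2.4.0+cu121' for a CUDA wheel; CPU-only wheels typically end in '+cpu'.
        "torch_version": torch.__version__,
        # CUDA version the binary was built against; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        # False if the driver cannot be initialized, even with a CUDA build.
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }


if __name__ == "__main__":
    for key, value in torch_report().items():
        print(f"{key}: {value}")
```

A CUDA build (`built_with_cuda` not None) combined with `cuda_available: False` points at the driver or node setup rather than at the PyTorch install, which is the situation in this thread.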

OK, good! The right binaries are already installed. Are you able to run any of the CUDA samples, e.g. from here?

Hi @ptrblck.

I downloaded the tar.gz file for CUDA 12.1 from the website you sent me, copied it to the server with scp, and unpacked it. I then changed into the samples directory, then the 0_introduction folder, built several of the samples with make, and ran them. Here are some of their outputs:

(DEEPLABCUT) [fmmachta@cn14 vectorAdd]$ make
/usr/local/cuda/bin/nvcc -ccbin g++ -I../../../Common  -m64    --threads 0 --std=c++11 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_89,code=sm_89 -gencode arch=compute_90,code=sm_90 -gencode arch=compute_90,code=compute_90 -o vectorAdd.o -c vectorAdd.cu
/usr/local/cuda/bin/nvcc -ccbin g++   -m64      -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_89,code=sm_89 -gencode arch=compute_90,code=sm_90 -gencode arch=compute_90,code=compute_90 -o vectorAdd vectorAdd.o 
mkdir -p ../../../bin/x86_64/linux/release
cp vectorAdd ../../../bin/x86_64/linux/release
(DEEPLABCUT) [fmmachta@cn14 vectorAdd]$ cd ../../../bin/x86_64/linux/release
(DEEPLABCUT) [fmmachta@cn14 release]$ ./vectorAdd
[Vector addition of 50000 elements]
Failed to allocate device vector A (error code unknown error)!
(DEEPLABCUT) [fmmachta@cn14 release]$ ./simpleZeroCopy
CUDA error at ../../../Common/helper_cuda.h:951 code=999(cudaErrorUnknown) "cudaGetDevice(&dev)" 
(DEEPLABCUT) [fmmachta@cn14 release]$ ./asyncAPI
[./asyncAPI] - Starting...
CUDA error at ../../../Common/helper_cuda.h:801 code=999(cudaErrorUnknown) "cudaGetDeviceCount(&device_count)" 
(DEEPLABCUT) [fmmachta@cn14 release]$ ./clock
CUDA Clock sample
CUDA error at ../../../Common/helper_cuda.h:801 code=999(cudaErrorUnknown) "cudaGetDeviceCount(&device_count)" 
(DEEPLABCUT) [fmmachta@cn14 release]$ ./clock_nvrtc
CUDA Clock sample
checkCudaErrors() Driver API error = 0003 "initialization error" from file <../../../Common/helper_cuda_drvapi.h>, line 207.

Thank you!

Thanks for the check!
Since none of the CUDA samples are executing properly, it sounds as if your setup is not working correctly.

From your description, this server is managed by another team, and I would expect at least simple CUDA samples to work.
Since these are already failing, I would suggest asking them to run a smoke test on these systems, since apparently you cannot reinstall the driver yourself.
Once the CUDA samples work (e.g. after the driver is reinstalled, if that turns out to be needed), your PyTorch installation should also work, since you are now using the CUDA-enabled PyTorch binaries.
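
For a smoke test that needs neither nvcc nor PyTorch, the driver library can be called directly. The sketch below is one possible way to do that, assuming the driver library is loadable under the name `libcuda.so.1`. The function name `check_cuda_driver` is hypothetical, while `cuInit` and `cuDeviceGetCount` are the CUDA driver API entry points. A nonzero return code such as 999 (unknown error, the same code the samples printed) would confirm that the failure sits below PyTorch.

```python
import ctypes


def check_cuda_driver(lib_name="libcuda.so.1"):
    """Try to initialize the CUDA driver directly.

    Returns (status, detail) where status is 'missing', 'error', or 'ok'.
    Assumption: the driver library is loadable under `lib_name`.
    """
    try:
        libcuda = ctypes.CDLL(lib_name)
    except OSError as exc:
        # The driver library is not installed or not on the loader path.
        return "missing", str(exc)

    # CUresult cuInit(unsigned int Flags); 0 means success.
    libcuda.cuInit.argtypes = [ctypes.c_uint]
    libcuda.cuInit.restype = ctypes.c_int
    rc = libcuda.cuInit(0)
    if rc != 0:
        return "error", f"cuInit returned {rc}"

    # CUresult cuDeviceGetCount(int *count)
    count = ctypes.c_int(0)
    libcuda.cuDeviceGetCount.argtypes = [ctypes.POINTER(ctypes.c_int)]
    libcuda.cuDeviceGetCount.restype = ctypes.c_int
    rc = libcuda.cuDeviceGetCount(ctypes.byref(count))
    if rc != 0:
        return "error", f"cuDeviceGetCount returned {rc}"

    return "ok", f"{count.value} device(s) visible"


if __name__ == "__main__":
    print(check_cuda_driver())
```

Running this inside the same srun session as the failing samples keeps the comparison fair, and the printed tuple is short enough to include when asking the administrators for help.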

Thank you for everything, @ptrblck. I'm a student at a university and I don't have root privileges to install drivers. Do you think I could somehow install the drivers locally so they run within my conda environment? Could I also be using SLURM incorrectly? I am using it through an interactive session. There are also modules available on the node.

(base) fmmachta@commander ~ $ srun --nodes=1 --ntasks=1 --cpus-per-task=4 --mem=16GB --time=04:00:00 --nodelist=cn14 --gres=gpu:2 --pty bash
'abrt-cli status' timed out
(base) [fmmachta@cn14 ~]$ conda activate DEEPLABCUT
(DEEPLABCUT) [fmmachta@cn14 ~]$ nvidia-smi
Sat Jul 27 07:57:42 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla P100-PCIE-16GB            Off| 00000000:08:00.0 Off |                    0 |
| N/A   27C    P0               27W / 250W|      0MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  Tesla P100-PCIE-16GB            Off| 00000000:84:00.0 Off |                    0 |
| N/A   29C    P0               28W / 250W|      0MiB / 16384MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
(DEEPLABCUT) [fmmachta@cn14 ~]$ module avail

------------------------ /usr/share/Modules/modulefiles ------------------------
dot         module-info modules     use.own

------------------------------- /etc/modulefiles -------------------------------
anaconda3/2022.05            openmpi/1.10.3
cuda/10.0                    openmpi/1.10.3-cuda
cuda/10.1                    openmpi/2.1.1-cuda
cuda/10.2                    openmpi/2.1.2
cuda/11.0                    openmpi/3.0.0rc6
cuda/11.1                    orca/3.0.3
cuda/11.2                    orca/4.0.0
cuda/11.3                    petsc/3.7.6
cuda/11.4                    python/2/pandas/0.20.3
cuda/8                       python/3/mpi4py/3.0.0
cuda/8.0                     python/3/pandas/0.20.3
cuda/9.0                     python/3/pycuda/2019.1.1
dlib/19.6                    python/3/scikit-learn/19.0
GPU/anaconda3/24             python/3/tensorflow/1.2.1
GPU/cuda/cuda-11.5.1         quantum_espresso/6.1
GPU/cuda/cuda-11.6           R/open/3.4.0
GPU/cuda/cuda-11.7           R/open/4.2.1
GPU/cuda/cuda-11.8           ruby/3.1.2
GPU/cuda/cuda-12.1           scl/devtoolset/8
GPU/cuda/cudnn-8.3.1.22-11.5 singularity/2.3.1
GPU/gcc/gcc-11.2.0           system/opencv/2.4.5
GPU/openACC/22.3             system/perl/5.16.3
GPU/opencv/opencv-4.5.5      system/perl/bioperl/1.7.1
GPU/openmpi/4.1.3            system/python/2.7.5
GPU/python3/python-3.9.12    system/python/3.4.5
gromacs/2018.1               system/python/keras/2.0.6
java/oracle/8                system/R/3.4.0
julia/1.2.0                  visit/2.13.2
mpi/mpich-3.2-x86_64
(DEEPLABCUT) [fmmachta@cn14 ~]$ module load GPU/cuda/cuda-11.8
(DEEPLABCUT) [fmmachta@cn14 ~]$ module list
Currently Loaded Modulefiles:
  1) cuda-11.8
(DEEPLABCUT) [fmmachta@cn14 ~]$ nvidia-smi
Sat Jul 27 07:58:28 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla P100-PCIE-16GB            Off| 00000000:08:00.0 Off |                    0 |
| N/A   27C    P0               27W / 250W|      0MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  Tesla P100-PCIE-16GB            Off| 00000000:84:00.0 Off |                    0 |
| N/A   29C    P0               27W / 250W|      0MiB / 16384MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
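
The nvidia-smi header is unchanged after `module load` because the "CUDA Version" field is reported by the driver, not by the loaded toolkit module. What can be inspected without root is the environment the SLURM step actually received. The sketch below is only an illustration: the function name `gpu_env_report` is hypothetical, and it assumes the GPU device files live under `/dev/nvidia*` and that the site's SLURM configuration sets variables such as `SLURM_JOB_GPUS` or `SLURM_STEP_GPUS` (not every site does).

```python
import glob
import os


def gpu_env_report(environ=None, dev_glob="/dev/nvidia*"):
    """Summarize GPU-related environment variables and device file access.

    Assumptions: device files match `dev_glob`; the SLURM_* names below may
    be unset depending on how the cluster is configured.
    """
    environ = os.environ if environ is None else environ
    names = [
        "CUDA_VISIBLE_DEVICES",  # mentioned in the PyTorch warning above
        "SLURM_JOB_ID",
        "SLURM_JOB_GPUS",
        "SLURM_STEP_GPUS",
        "LD_LIBRARY_PATH",       # may change when a cuda module is loaded
    ]
    env = {name: environ.get(name) for name in names}

    # True means the current user can open the device file for read and write.
    devices = {
        path: os.access(path, os.R_OK | os.W_OK)
        for path in sorted(glob.glob(dev_glob))
    }
    return {"env": env, "devices": devices}


if __name__ == "__main__":
    report = gpu_env_report()
    for name, value in report["env"].items():
        print(f"{name}={value}")
    for path, accessible in report["devices"].items():
        print(f"{path}: read/write access = {accessible}")
```

An empty `CUDA_VISIBLE_DEVICES` or device files without read/write access would be worth mentioning to the administrators. If everything looks normal here, the driver itself is the more likely culprit.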

I think I have no choice but to contact the administrators. Thank you for all of your help!