PyTorch uses wrong CUDA version

Hello everybody,

PyTorch seems to use the wrong CUDA version.
I create a fresh conda environment with

conda create -n myenv

Then in this environment I install torch via

conda install pytorch torchvision torchaudio cudatoolkit=11.6 -c pytorch -c conda-forge

Afterwards, if I start Python in this environment and import torch, torch.__version__ yields ‘1.11.0+cu102’.
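For readers unfamiliar with the suffix: the part after the "+" is the local version tag of the binary, and "cu102" names the CUDA runtime it was built with (here 10.2). A minimal sketch for decoding such a tag, assuming the usual "cu" plus digits convention where the last digit is the minor version; the helper name cuda_from_version_tag is made up for illustration and is not part of PyTorch:

```python
import re
from typing import Optional


def cuda_from_version_tag(version: str) -> Optional[str]:
    """Return the CUDA version encoded in a PyTorch version string, if any.

    Assumes the 'cuXY' / 'cuXYZ' convention, e.g. '+cu102' means CUDA 10.2.
    """
    match = re.search(r"\+cu(\d{2,})$", version)
    if match is None:
        # No local tag, e.g. the plain '1.11.0' that pip list prints.
        return None
    digits = match.group(1)
    # Assumption: the last digit is the minor version, the rest is the major.
    return f"{digits[:-1]}.{digits[-1]}"


print(cuda_from_version_tag("1.11.0+cu102"))  # 10.2
print(cuda_from_version_tag("1.12.0"))        # None
```

With torch importable, torch.__version__ and torch.version.cuda report these values directly, so the helper is only useful for reading version strings copied from logs.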

Other info:
nvidia-smi shows

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.60.02    Driver Version: 510.60.02    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:26:00.0  On |                  N/A |
| 43%   36C    P8     9W / 125W |   1420MiB /  6144MiB |      4%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1655      G   /usr/lib/Xorg                     174MiB |
|    0   N/A  N/A      1736      G   /usr/bin/gnome-shell               64MiB |
|    0   N/A  N/A     11012      C   /usr/bin/python                   933MiB |
|    0   N/A  N/A     30207      G   /usr/lib/firefox/firefox          166MiB |
|    0   N/A  N/A    100867      G   ...b/thunderbird/thunderbird       64MiB |
|    0   N/A  N/A    149129      G   ...AAAAAAAAA= --shared-files       10MiB |
+-----------------------------------------------------------------------------+

and nvcc --version shows

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Thu_Feb_10_18:23:41_PST_2022
Cuda compilation tools, release 11.6, V11.6.112
Build cuda_11.6.r11.6/compiler.30978841_0

Thanks for any help.

Could you post the install log showing which PyTorch binary was picked as well as the output of pip list | grep torch and conda list | grep torch?

That response was quick :)

Install log shows:

pytorch            pytorch/linux-64::pytorch-1.12.0-py3.9_cuda11.6_cudnn8.3.2_0

pip list | grep torch shows

torch                             1.11.0
torchaudio                        0.12.0
torchio                           0.18.75
torchvision                       0.12.0

and conda list | grep torch

ffmpeg                    4.3                  hf484d3e_0    pytorch
pytorch                   1.12.0          py3.9_cuda11.6_cudnn8.3.2_0    pytorch
pytorch-mutex             1.0                        cuda    pytorch
torchaudio                0.12.0               py39_cu116    pytorch
torchvision               0.13.0               py39_cu116    pytorch

Thanks for the update.
So the install command seems to work as conda list shows the right binary:

pytorch                   1.12.0          py3.9_cuda11.6_cudnn8.3.2_0

but you have multiple PyTorch binaries installed where the one installed via pip seems to use the CUDA 10.2 runtime and is an older version:

torch                             1.11.0

Make sure to either uninstall all PyTorch binaries from the current environment or create a new environment and install the right binary there.
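To see which of the installed copies Python actually picks up, it can help to ask the interpreter where it would load torch from, without importing it. This is only a sketch: module_origin and is_inside_env are hypothetical helper names, and comparing against sys.prefix assumes the interpreter belongs to the activated conda environment.

```python
import importlib.util
import os
import sys
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def is_inside_env(path: str) -> bool:
    """True if path lies under the prefix of the running interpreter."""
    prefix = os.path.realpath(sys.prefix)
    return os.path.realpath(path).startswith(prefix + os.sep)


origin = module_origin("torch")
if origin is None:
    print("torch is not importable from this interpreter")
else:
    print("torch would be loaded from:", origin)
    print("inside active environment:", is_inside_env(origin))
```

If the reported path is outside the environment directory, the copy being imported was not installed by the conda command above.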


Ahhh, that is interesting. I have now realized that when I create a fresh conda environment, torch is already installed. If you have time, can you tell me how that is possible? Thank you so much already for the really fast help!

Are you only creating it or also activating the new environment? If you are indeed activating it, I guess conda might reuse the packages installed in the base environment so in this case you might consider keeping the base env clean and empty.

“If you are indeed activating it, I guess conda might reuse the packages installed in the base environment”

I also thought so. I uninstalled everything torch-related with pip and conda from the base environment, but I still have the same issue as in the beginning.
In the fresh environment, pip list | grep torch and conda list | grep torch both show nothing. After running only the command

conda install pytorch torchvision torchaudio cudatoolkit=11.6 -c pytorch -c conda-forge

pip list | grep torch shows

torch                             1.11.0
torchaudio                        0.12.0
torchio                           0.18.75
torchvision                       0.12.0

and conda list | grep torch shows

ffmpeg                    4.3                  hf484d3e_0    pytorch
pytorch                   1.12.0          py3.9_cuda11.6_cudnn8.3.2_0    pytorch
pytorch-mutex             1.0                        cuda    pytorch
torchaudio                0.12.0               py39_cu116    pytorch
torchvision               0.13.0               py39_cu116    pytorch

Somehow it seems that the old version 1.11.0 of torch is also installed via pip in the background, but I don’t know why.
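As far as I know, pip list reports every distribution visible on the interpreter's search path, while conda list only looks at the environment itself, which would explain why the two lists disagree. A small sketch that prints where each torch-related distribution is installed, using only the standard library (find_distributions is a made-up helper name):

```python
from importlib import metadata
from typing import List, Tuple


def find_distributions(fragment: str) -> List[Tuple[str, str, str]]:
    """List (name, version, location) of installed distributions
    whose name contains the given fragment (case-insensitive)."""
    found = []
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name") or ""
            version = dist.version or "unknown"
            # Directory that contains the distribution's metadata folder.
            location = str(dist.locate_file(""))
        except Exception:
            # Skip distributions whose metadata cannot be read.
            continue
        if fragment.lower() in name.lower():
            found.append((name, version, location))
    return sorted(found)


for name, version, location in find_distributions("torch"):
    print(f"{name:<14} {version:<10} {location}")
```

A torch 1.11.0 entry whose location is not the site-packages directory of the new environment would point to a copy installed somewhere else.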

Made it work now. In the fresh conda environment I used pip instead of conda to install torch via

pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116

Now it seems to work fine. Still, I am a little worried about the whole thing, haha.

Good to hear it’s working with pip, but it’s still weird that conda is apparently installing two different binaries.
Could you post the install log from the conda install command so that I could check what’s happening?
I guess conda might be seeing a “wrong” dependency and might install torch==1.11.0 additionally for unknown reasons.

So this is the install log from conda install:

## Package Plan ##

  environment location: /home/habring/anaconda3/envs/conf_pred

  added / updated specs:
    - cudatoolkit=11.6
    - pytorch
    - torchaudio
    - torchvision


The following NEW packages will be INSTALLED:

  _libgcc_mutex      conda-forge/linux-64::_libgcc_mutex-0.1-conda_forge
  _openmp_mutex      conda-forge/linux-64::_openmp_mutex-4.5-2_kmp_llvm
  blas               pkgs/main/linux-64::blas-1.0-mkl
  brotlipy           conda-forge/linux-64::brotlipy-0.7.0-py39hb9d737c_1004
  bzip2              conda-forge/linux-64::bzip2-1.0.8-h7f98852_4
  ca-certificates    conda-forge/linux-64::ca-certificates-2022.6.15-ha878542_0
  certifi            conda-forge/linux-64::certifi-2022.6.15-py39hf3d152e_0
  cffi               conda-forge/linux-64::cffi-1.15.1-py39he91dace_0
  charset-normalizer conda-forge/noarch::charset-normalizer-2.1.0-pyhd8ed1ab_0
  cryptography       conda-forge/linux-64::cryptography-37.0.4-py39hd97740a_0
  cudatoolkit        conda-forge/linux-64::cudatoolkit-11.6.0-hecad31d_10
  ffmpeg             pytorch/linux-64::ffmpeg-4.3-hf484d3e_0
  freetype           conda-forge/linux-64::freetype-2.10.4-h0708190_1
  giflib             conda-forge/linux-64::giflib-5.2.1-h36c2ea0_2
  gmp                conda-forge/linux-64::gmp-6.2.1-h58526e2_0
  gnutls             conda-forge/linux-64::gnutls-3.6.13-h85f3911_1
  idna               conda-forge/noarch::idna-3.3-pyhd8ed1ab_0
  jpeg               conda-forge/linux-64::jpeg-9e-h166bdaf_2
  lame               conda-forge/linux-64::lame-3.100-h7f98852_1001
  lcms2              conda-forge/linux-64::lcms2-2.12-hddcbb42_0
  ld_impl_linux-64   conda-forge/linux-64::ld_impl_linux-64-2.36.1-hea4e1c9_2
  lerc               conda-forge/linux-64::lerc-3.0-h9c3ff4c_0
  libdeflate         conda-forge/linux-64::libdeflate-1.12-h166bdaf_0
  libffi             conda-forge/linux-64::libffi-3.4.2-h7f98852_5
  libgcc-ng          conda-forge/linux-64::libgcc-ng-12.1.0-h8d9b700_16
  libiconv           conda-forge/linux-64::libiconv-1.17-h166bdaf_0
  libnsl             conda-forge/linux-64::libnsl-2.0.0-h7f98852_0
  libpng             conda-forge/linux-64::libpng-1.6.37-h753d276_3
  libstdcxx-ng       conda-forge/linux-64::libstdcxx-ng-12.1.0-ha89aaad_16
  libtiff            conda-forge/linux-64::libtiff-4.4.0-hc85c160_1
  libuuid            conda-forge/linux-64::libuuid-2.32.1-h7f98852_1000
  libwebp            conda-forge/linux-64::libwebp-1.2.3-h522a892_0
  libwebp-base       conda-forge/linux-64::libwebp-base-1.2.3-h166bdaf_0
  libxcb             conda-forge/linux-64::libxcb-1.13-h7f98852_1004
  libzlib            conda-forge/linux-64::libzlib-1.2.12-h166bdaf_2
  llvm-openmp        conda-forge/linux-64::llvm-openmp-14.0.4-he0ac6c6_0
  lz4-c              conda-forge/linux-64::lz4-c-1.9.3-h9c3ff4c_1
  mkl                conda-forge/linux-64::mkl-2021.4.0-h8d4b97c_729
  mkl-service        conda-forge/linux-64::mkl-service-2.4.0-py39h7e14d7c_0
  mkl_fft            conda-forge/linux-64::mkl_fft-1.3.1-py39h0c7bc48_1
  mkl_random         conda-forge/linux-64::mkl_random-1.2.2-py39hde0f152_0
  ncurses            conda-forge/linux-64::ncurses-6.3-h27087fc_1
  nettle             conda-forge/linux-64::nettle-3.6-he412f7d_0
  numpy              pkgs/main/linux-64::numpy-1.22.3-py39he7a7128_0
  numpy-base         pkgs/main/linux-64::numpy-base-1.22.3-py39hf524024_0
  openh264           conda-forge/linux-64::openh264-2.1.1-h780b84a_0
  openjpeg           conda-forge/linux-64::openjpeg-2.4.0-hb52868f_1
  openssl            conda-forge/linux-64::openssl-1.1.1q-h166bdaf_0
  pillow             conda-forge/linux-64::pillow-9.2.0-py39hae2aec6_0
  pip                conda-forge/noarch::pip-22.1.2-pyhd8ed1ab_0
  pthread-stubs      conda-forge/linux-64::pthread-stubs-0.4-h36c2ea0_1001
  pycparser          conda-forge/noarch::pycparser-2.21-pyhd8ed1ab_0
  pyopenssl          conda-forge/noarch::pyopenssl-22.0.0-pyhd8ed1ab_0
  pysocks            conda-forge/linux-64::pysocks-1.7.1-py39hf3d152e_5
  python             conda-forge/linux-64::python-3.9.13-h9a8a25e_0_cpython
  python_abi         conda-forge/linux-64::python_abi-3.9-2_cp39
  pytorch            pytorch/linux-64::pytorch-1.12.0-py3.9_cuda11.6_cudnn8.3.2_0
  pytorch-mutex      pytorch/noarch::pytorch-mutex-1.0-cuda
  readline           conda-forge/linux-64::readline-8.1.2-h0f457ee_0
  requests           conda-forge/noarch::requests-2.28.1-pyhd8ed1ab_0
  setuptools         conda-forge/linux-64::setuptools-63.2.0-py39hf3d152e_0
  six                conda-forge/noarch::six-1.16.0-pyh6c4a22f_0
  sqlite             conda-forge/linux-64::sqlite-3.39.1-h4ff8645_0
  tbb                conda-forge/linux-64::tbb-2021.5.0-h924138e_1
  tk                 conda-forge/linux-64::tk-8.6.12-h27826a3_0
  torchaudio         pytorch/linux-64::torchaudio-0.12.0-py39_cu116
  torchvision        pytorch/linux-64::torchvision-0.13.0-py39_cu116
  typing_extensions  conda-forge/noarch::typing_extensions-4.3.0-pyha770c72_0
  tzdata             conda-forge/noarch::tzdata-2022a-h191b570_0
  urllib3            conda-forge/noarch::urllib3-1.26.10-pyhd8ed1ab_0
  wheel              conda-forge/noarch::wheel-0.37.1-pyhd8ed1ab_0
  xorg-libxau        conda-forge/linux-64::xorg-libxau-1.0.9-h7f98852_0
  xorg-libxdmcp      conda-forge/linux-64::xorg-libxdmcp-1.1.3-h7f98852_0
  xz                 conda-forge/linux-64::xz-5.2.5-h516909a_1
  zlib               conda-forge/linux-64::zlib-1.2.12-h166bdaf_2
  zstd               conda-forge/linux-64::zstd-1.5.2-h8a70e8d_2


Proceed ([y]/n)? Y

Preparing transaction: done
Verifying transaction: done
Executing transaction: - By downloading and using the CUDA Toolkit conda packages, you accept the terms and conditions of the CUDA End User License Agreement (EULA): https://docs.nvidia.com/cuda/eula/index.html

done

I noticed another interesting issue now: As explained, everything worked out yesterday when I used pip to install torch in my new environment. But today I noticed that in this case torch was also installed in the base environment. I don’t know if this is the expected behavior. When I then uninstalled torch in the base environment it was also removed from the new environment.
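One possible explanation, which nothing in this thread confirms, is that the package lives in a location shared by several environments, for example the per-user site-packages directory (where pip install --user puts things) or a directory listed in PYTHONPATH. A rough sketch for checking this from inside the new environment; paths_outside_env is a hypothetical helper, and the comparison against sys.prefix assumes a conda environment, where the standard library sits under the environment prefix:

```python
import os
import site
import sys
from typing import List


def paths_outside_env(paths: List[str], prefix: str) -> List[str]:
    """Return the existing directories in paths that are not under prefix."""
    root = os.path.realpath(prefix) + os.sep
    outside = []
    for entry in paths:
        # Skip empty entries (current directory) and non-directories (zip files).
        if not entry or not os.path.isdir(entry):
            continue
        if not (os.path.realpath(entry) + os.sep).startswith(root):
            outside.append(entry)
    return outside


print("environment prefix:", sys.prefix)
print("user site enabled: ", site.ENABLE_USER_SITE)
print("user site path:    ", site.getusersitepackages())
print("PYTHONPATH:        ", os.environ.get("PYTHONPATH", "(not set)"))
for entry in paths_outside_env(sys.path, sys.prefix):
    print("outside env:", entry)
```

When this is run as a script, the script's own directory is on the search path and will be listed as well; any other directory outside the environment is a candidate for where the extra torch comes from.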

Yeah, I don’t understand what exactly is going on.
The conda install command doesn’t show any other torch==1.11.0 dependency and I don’t know why your base env is polluted.
I have 32 different environments right now and haven’t seen this strange behavior before.
Btw. I’m using conda==4.13.0 with miniforge in case that matters.

Weird… Sometimes it be like that
Thank you very much for the help anyway!