AndreasH
(Andreas Habring)
July 19, 2022, 8:52am
1
Hello everybody,
PyTorch seems to use the wrong CUDA version.
I create a fresh conda environment with
conda create -n myenv
Then in this environment I install torch via
conda install pytorch torchvision torchaudio cudatoolkit=11.6 -c pytorch -c conda-forge
Afterwards, if I start Python in this environment and import torch, torch.__version__ yields ‘1.11.0+cu102’.
Other info:
nvidia-smi shows
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.60.02 Driver Version: 510.60.02 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:26:00.0 On | N/A |
| 43% 36C P8 9W / 125W | 1420MiB / 6144MiB | 4% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1655 G /usr/lib/Xorg 174MiB |
| 0 N/A N/A 1736 G /usr/bin/gnome-shell 64MiB |
| 0 N/A N/A 11012 C /usr/bin/python 933MiB |
| 0 N/A N/A 30207 G /usr/lib/firefox/firefox 166MiB |
| 0 N/A N/A 100867 G ...b/thunderbird/thunderbird 64MiB |
| 0 N/A N/A 149129 G ...AAAAAAAAA= --shared-files 10MiB |
+-----------------------------------------------------------------------------+
and nvcc --version shows
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Thu_Feb_10_18:23:41_PST_2022
Cuda compilation tools, release 11.6, V11.6.112
Build cuda_11.6.r11.6/compiler.30978841_0
Thanks for any help.
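For readers comparing the numbers above: the '+cu102' suffix in the reported version string refers to the CUDA runtime the installed binary was built with, which is independent of what nvidia-smi and nvcc report for the system. The helper below is a minimal sketch for decoding such a suffix. The function name is hypothetical, and the digit-splitting rule (last digit is the minor version) is an assumed convention, not something stated in this thread. When torch can be imported, torch.version.cuda gives the same information directly.

```python
import re
from typing import Optional


def cuda_from_version_tag(version: str) -> Optional[str]:
    """Return the CUDA version encoded in a '+cuXYZ' suffix, or None if absent."""
    match = re.search(r"\+cu(\d{2,})$", version)
    if match is None:
        return None  # no '+cu' suffix, e.g. '1.12.0' or '1.12.0+cpu'
    digits = match.group(1)
    # Assumed convention: the last digit is the minor version, the rest the major.
    return f"{digits[:-1]}.{digits[-1]}"


# Version string reported in this thread
print(cuda_from_version_tag("1.11.0+cu102"))  # 10.2
```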
Could you post the install log showing which PyTorch binary was picked, as well as the output of pip list | grep torch and conda list | grep torch?
AndreasH
(Andreas Habring)
July 19, 2022, 8:59am
3
That response was quick
Install log shows:
pytorch pytorch/linux-64::pytorch-1.12.0-py3.9_cuda11.6_cudnn8.3.2_0
pip list | grep torch shows
torch 1.11.0
torchaudio 0.12.0
torchio 0.18.75
torchvision 0.12.0
and conda list | grep torch
ffmpeg 4.3 hf484d3e_0 pytorch
pytorch 1.12.0 py3.9_cuda11.6_cudnn8.3.2_0 pytorch
pytorch-mutex 1.0 cuda pytorch
torchaudio 0.12.0 py39_cu116 pytorch
torchvision 0.13.0 py39_cu116 pytorch
Thanks for the update.
So the install command seems to work, as conda list shows the right binary:
pytorch 1.12.0 py3.9_cuda11.6_cudnn8.3.2_0
but you have multiple PyTorch binaries installed, where the one installed via pip seems to use the CUDA 10.2 runtime and is an older version:
torch 1.11.0
Make sure to either uninstall all PyTorch binaries from the current environment or create a new environment and install the right binary there.
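To see every torch-related package the current interpreter can find, together with the directory it lives in, something like the following may help. It is a sketch using only the standard library (importlib.metadata, Python 3.8+); the function name is hypothetical, and which entries show up depends on the interpreter's sys.path.

```python
from importlib import metadata


def torch_distributions():
    """Return (name, version, location) for each installed distribution
    whose name starts with 'torch'."""
    found = []
    for dist in metadata.distributions():
        meta = dist.metadata
        name = (meta.get("Name") if meta else None) or ""
        if name.lower().startswith("torch"):
            # locate_file("") resolves to the directory holding the package
            location = str(dist.locate_file(""))
            found.append((name, dist.version or "?", location))
    return sorted(found)


for name, version, location in torch_distributions():
    print(f"{name:15s} {version:10s} {location}")
```

If the same name appears with two different locations, the interpreter can see more than one install, and the one earlier on sys.path is the one that gets imported.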
AndreasH
(Andreas Habring)
July 19, 2022, 9:14am
5
Ahhh, that is interesting. I now realize that when I create a fresh conda environment, torch is already installed. If you have time, can you tell me how that is possible? Thank you so much already for the really fast help!
Are you only creating it or also activating the new environment? If you are indeed activating it, I guess conda might reuse the packages installed in the base environment, so in this case you might consider keeping the base env clean and empty.
AndreasH
(Andreas Habring)
July 19, 2022, 9:46am
7
If you are indeed activating it, I guess conda might reuse the packages installed in the base environment
I also thought so. I uninstalled everything torch-related with pip and conda from the base environment, but I am still having the same issue as in the beginning.
In the fresh environment, pip list | grep torch and conda list | grep torch both show nothing. After running only the command
conda install pytorch torchvision torchaudio cudatoolkit=11.6 -c pytorch -c conda-forge
pip list | grep torch shows
torch 1.11.0
torchaudio 0.12.0
torchio 0.18.75
torchvision 0.12.0
and conda list | grep torch shows
ffmpeg 4.3 hf484d3e_0 pytorch
pytorch 1.12.0 py3.9_cuda11.6_cudnn8.3.2_0 pytorch
pytorch-mutex 1.0 cuda pytorch
torchaudio 0.12.0 py39_cu116 pytorch
torchvision 0.13.0 py39_cu116 pytorch
Somehow it seems that the old version 1.11.0 of torch is also installed via pip in the background, but I don’t know why.
AndreasH
(Andreas Habring)
July 19, 2022, 9:51am
8
Made it work now. In the fresh conda environment I used pip instead of conda to install torch via
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
Now it seems to work fine. Still, I am a little worried about that whole thing, haha.
Good to hear it’s working with pip, but it’s still weird that conda is apparently installing two different wheels.
Could you post the install log from the conda install command so that I could check what’s happening?
I guess conda might be seeing a “wrong” dependency and might install torch==1.11.0 additionally for unknown reasons.
AndreasH
(Andreas Habring)
July 20, 2022, 7:19am
10
So this is the install log from conda install:
## Package Plan ##
environment location: /home/habring/anaconda3/envs/conf_pred
added / updated specs:
- cudatoolkit=11.6
- pytorch
- torchaudio
- torchvision
The following NEW packages will be INSTALLED:
_libgcc_mutex conda-forge/linux-64::_libgcc_mutex-0.1-conda_forge
_openmp_mutex conda-forge/linux-64::_openmp_mutex-4.5-2_kmp_llvm
blas pkgs/main/linux-64::blas-1.0-mkl
brotlipy conda-forge/linux-64::brotlipy-0.7.0-py39hb9d737c_1004
bzip2 conda-forge/linux-64::bzip2-1.0.8-h7f98852_4
ca-certificates conda-forge/linux-64::ca-certificates-2022.6.15-ha878542_0
certifi conda-forge/linux-64::certifi-2022.6.15-py39hf3d152e_0
cffi conda-forge/linux-64::cffi-1.15.1-py39he91dace_0
charset-normalizer conda-forge/noarch::charset-normalizer-2.1.0-pyhd8ed1ab_0
cryptography conda-forge/linux-64::cryptography-37.0.4-py39hd97740a_0
cudatoolkit conda-forge/linux-64::cudatoolkit-11.6.0-hecad31d_10
ffmpeg pytorch/linux-64::ffmpeg-4.3-hf484d3e_0
freetype conda-forge/linux-64::freetype-2.10.4-h0708190_1
giflib conda-forge/linux-64::giflib-5.2.1-h36c2ea0_2
gmp conda-forge/linux-64::gmp-6.2.1-h58526e2_0
gnutls conda-forge/linux-64::gnutls-3.6.13-h85f3911_1
idna conda-forge/noarch::idna-3.3-pyhd8ed1ab_0
jpeg conda-forge/linux-64::jpeg-9e-h166bdaf_2
lame conda-forge/linux-64::lame-3.100-h7f98852_1001
lcms2 conda-forge/linux-64::lcms2-2.12-hddcbb42_0
ld_impl_linux-64 conda-forge/linux-64::ld_impl_linux-64-2.36.1-hea4e1c9_2
lerc conda-forge/linux-64::lerc-3.0-h9c3ff4c_0
libdeflate conda-forge/linux-64::libdeflate-1.12-h166bdaf_0
libffi conda-forge/linux-64::libffi-3.4.2-h7f98852_5
libgcc-ng conda-forge/linux-64::libgcc-ng-12.1.0-h8d9b700_16
libiconv conda-forge/linux-64::libiconv-1.17-h166bdaf_0
libnsl conda-forge/linux-64::libnsl-2.0.0-h7f98852_0
libpng conda-forge/linux-64::libpng-1.6.37-h753d276_3
libstdcxx-ng conda-forge/linux-64::libstdcxx-ng-12.1.0-ha89aaad_16
libtiff conda-forge/linux-64::libtiff-4.4.0-hc85c160_1
libuuid conda-forge/linux-64::libuuid-2.32.1-h7f98852_1000
libwebp conda-forge/linux-64::libwebp-1.2.3-h522a892_0
libwebp-base conda-forge/linux-64::libwebp-base-1.2.3-h166bdaf_0
libxcb conda-forge/linux-64::libxcb-1.13-h7f98852_1004
libzlib conda-forge/linux-64::libzlib-1.2.12-h166bdaf_2
llvm-openmp conda-forge/linux-64::llvm-openmp-14.0.4-he0ac6c6_0
lz4-c conda-forge/linux-64::lz4-c-1.9.3-h9c3ff4c_1
mkl conda-forge/linux-64::mkl-2021.4.0-h8d4b97c_729
mkl-service conda-forge/linux-64::mkl-service-2.4.0-py39h7e14d7c_0
mkl_fft conda-forge/linux-64::mkl_fft-1.3.1-py39h0c7bc48_1
mkl_random conda-forge/linux-64::mkl_random-1.2.2-py39hde0f152_0
ncurses conda-forge/linux-64::ncurses-6.3-h27087fc_1
nettle conda-forge/linux-64::nettle-3.6-he412f7d_0
numpy pkgs/main/linux-64::numpy-1.22.3-py39he7a7128_0
numpy-base pkgs/main/linux-64::numpy-base-1.22.3-py39hf524024_0
openh264 conda-forge/linux-64::openh264-2.1.1-h780b84a_0
openjpeg conda-forge/linux-64::openjpeg-2.4.0-hb52868f_1
openssl conda-forge/linux-64::openssl-1.1.1q-h166bdaf_0
pillow conda-forge/linux-64::pillow-9.2.0-py39hae2aec6_0
pip conda-forge/noarch::pip-22.1.2-pyhd8ed1ab_0
pthread-stubs conda-forge/linux-64::pthread-stubs-0.4-h36c2ea0_1001
pycparser conda-forge/noarch::pycparser-2.21-pyhd8ed1ab_0
pyopenssl conda-forge/noarch::pyopenssl-22.0.0-pyhd8ed1ab_0
pysocks conda-forge/linux-64::pysocks-1.7.1-py39hf3d152e_5
python conda-forge/linux-64::python-3.9.13-h9a8a25e_0_cpython
python_abi conda-forge/linux-64::python_abi-3.9-2_cp39
pytorch pytorch/linux-64::pytorch-1.12.0-py3.9_cuda11.6_cudnn8.3.2_0
pytorch-mutex pytorch/noarch::pytorch-mutex-1.0-cuda
readline conda-forge/linux-64::readline-8.1.2-h0f457ee_0
requests conda-forge/noarch::requests-2.28.1-pyhd8ed1ab_0
setuptools conda-forge/linux-64::setuptools-63.2.0-py39hf3d152e_0
six conda-forge/noarch::six-1.16.0-pyh6c4a22f_0
sqlite conda-forge/linux-64::sqlite-3.39.1-h4ff8645_0
tbb conda-forge/linux-64::tbb-2021.5.0-h924138e_1
tk conda-forge/linux-64::tk-8.6.12-h27826a3_0
torchaudio pytorch/linux-64::torchaudio-0.12.0-py39_cu116
torchvision pytorch/linux-64::torchvision-0.13.0-py39_cu116
typing_extensions conda-forge/noarch::typing_extensions-4.3.0-pyha770c72_0
tzdata conda-forge/noarch::tzdata-2022a-h191b570_0
urllib3 conda-forge/noarch::urllib3-1.26.10-pyhd8ed1ab_0
wheel conda-forge/noarch::wheel-0.37.1-pyhd8ed1ab_0
xorg-libxau conda-forge/linux-64::xorg-libxau-1.0.9-h7f98852_0
xorg-libxdmcp conda-forge/linux-64::xorg-libxdmcp-1.1.3-h7f98852_0
xz conda-forge/linux-64::xz-5.2.5-h516909a_1
zlib conda-forge/linux-64::zlib-1.2.12-h166bdaf_2
zstd conda-forge/linux-64::zstd-1.5.2-h8a70e8d_2
Proceed ([y]/n)? Y
Preparing transaction: done
Verifying transaction: done
Executing transaction: - By downloading and using the CUDA Toolkit conda packages, you accept the terms and conditions of the CUDA End User License Agreement (EULA): https://docs.nvidia.com/cuda/eula/index.html
done
I noticed another interesting issue now: As explained, everything worked out yesterday when I used pip to install torch in my new environment. But today I noticed that in this case torch was also installed in the base environment. I don’t know if this is the expected behavior. When I then uninstalled torch in the base environment it was also removed from the new environment.
Yeah, I don’t understand what exactly is going on.
The conda install command doesn’t show any other torch==1.11.0 dependency and I don’t know why your base env is polluted.
I have 32 different environments right now and haven’t seen this strange behavior before.
Btw. I’m using conda==4.13.0 with miniforge in case that matters.
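One way to narrow this down is to check where an import would actually resolve and whether that location is inside the active environment. The sketch below uses only the standard library and does not import the package itself; the function name and the returned keys are hypothetical. It also prints the per-user site-packages directory, which can be visible to several environments that use the same Python minor version. Whether that is what happened here is an assumption that the thread does not confirm.

```python
import importlib.util
import site
import sys
from pathlib import Path


def locate_package(name: str = "torch") -> dict:
    """Report where `import name` would resolve, without importing it."""
    spec = importlib.util.find_spec(name)
    origin = Path(spec.origin).resolve() if spec and spec.origin else None
    prefix = Path(sys.prefix).resolve()
    return {
        "env_prefix": str(prefix),
        "origin": str(origin) if origin is not None else None,
        # True only when the package files live under the active environment
        "inside_env": origin is not None and prefix in origin.parents,
        # Per-user directory that sits outside any single environment
        "user_site": site.getusersitepackages(),
        "user_site_enabled": bool(site.ENABLE_USER_SITE),
    }


for key, value in locate_package("torch").items():
    print(f"{key:18s} {value}")
```

If "inside_env" is False while "origin" points into the "user_site" directory, the package being imported does not belong to the conda environment, which would be consistent with it appearing in, and disappearing from, several environments at once.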
AndreasH
(Andreas Habring)
July 20, 2022, 10:01am
12
Weird… Sometimes it be like that
Thank you very much for the help anyway!