PyTorch for CUDA 10.2

silverant · January 1, 2020, 4:13pm

Hi all,

This is my first post and I am new to AI and RL, so please bear with me.

I use Ubuntu 18.04.1.
I followed the installation guide from PyTorch and the CUDA installation guide from NVIDIA.

I noticed that I cannot choose CUDA 10.2 for PyTorch, but the NVIDIA download site only offers CUDA 10.2.

After following the guides above, torch.cuda.is_available() returns False.

Here is the output of nvcc -V:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89

And here is nvidia-smi:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1050    On   | 00000000:01:00.0 Off |                  N/A |
| N/A   48C    P0    N/A /  N/A |    355MiB /  3020MiB |     10%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1105      G   /usr/lib/xorg/Xorg                            27MiB |
|    0      1211      G   /usr/bin/gnome-shell                          47MiB |
|    0      1453      G   /usr/lib/xorg/Xorg                           138MiB |
|    0      1609      G   /usr/bin/gnome-shell                         136MiB |
|    0      4046      G   /usr/lib/firefox/firefox                       1MiB |
+-----------------------------------------------------------------------------+

Now, when I run conda install pytorch torchvision cudatoolkit=10.1 -c pytorch, it says:
# All requested packages already installed.

Do you have any suggestions for solving this? I want to use CUDA.

Thanks in advance.

Best regards,

Edit: changed the NVIDIA link from runfile (local) to deb (local).

ptrblck · January 1, 2020, 7:32pm

The binaries ship with their own CUDA, cuDNN, etc., so you don’t need to install these libraries locally if you are fine with the provided versions.

Could you uninstall PyTorch in your conda environment and reinstall it (with cudatoolkit=10.1)?

If you want to use e.g. CUDA10.2, you would have to install it locally and build PyTorch from source.
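To see which CUDA and cuDNN versions an installed PyTorch binary was built with, independent of any locally installed toolkit, a check along these lines may help. This is only a minimal sketch: the helper name describe_torch_build is made up for illustration, and it returns early if torch cannot be imported in the active environment.

```python
def describe_torch_build():
    """Report what the installed PyTorch binary was built with.

    Hypothetical helper for illustration; returns a plain dict.
    """
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in the active environment.
        return {"installed": False}

    info = {
        "installed": True,
        "torch_version": torch.__version__,
        # None for CPU-only builds, a string such as "10.1" otherwise.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        info["cudnn_version"] = torch.backends.cudnn.version()
        info["device_name"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in describe_torch_build().items():
        print(f"{key}: {value}")
```

If built_with_cuda comes back as None, the environment most likely holds a CPU-only build, and a clean reinstall is probably the next thing to try.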

silverant · January 2, 2020, 12:18am

Dear ptrblck,

It works!

I followed your instructions.

First, uninstall PyTorch:

$ conda remove pytorch

## Package Plan ##

  environment location: /home/silverant/anaconda3/envs/rl_gym_book

  removed specs:
    - pytorch


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    mkl-2018.0.3               |                1       126.9 MB
    mkl_fft-1.0.6              |   py35h7dd41cf_0         134 KB
    mkl_random-1.0.1           |   py35h4414c95_1         313 KB
    numpy-1.15.2               |   py35h1d66e8a_0          46 KB
    numpy-base-1.15.2          |   py35h81de0dd_0         3.4 MB
    tbb-2019.8                 |       hfd86e86_0         1.1 MB
    tbb4py-2018.0.5            |   py35h6bb024c_0         201 KB
    ------------------------------------------------------------
                                           Total:       132.1 MB

The following NEW packages will be INSTALLED:

  mkl_fft            pkgs/main/linux-64::mkl_fft-1.0.6-py35h7dd41cf_0
  mkl_random         pkgs/main/linux-64::mkl_random-1.0.1-py35h4414c95_1
  numpy-base         pkgs/main/linux-64::numpy-base-1.15.2-py35h81de0dd_0
  tbb                pkgs/main/linux-64::tbb-2019.8-hfd86e86_0
  tbb4py             pkgs/main/linux-64::tbb4py-2018.0.5-py35h6bb024c_0

The following packages will be REMOVED:

  cudatoolkit-10.1.243-h6bb024c_0
  libtiff-4.1.0-h2733197_0
  olefile-0.46-py35_0
  pillow-5.2.0-py35heded4f4_0
  pytorch-1.3.1-py3.5_cuda10.1.243_cudnn7.6.3_0
  torchvision-0.4.2-py35_cu101
  zstd-1.3.7-h0b5b093_0

The following packages will be UPDATED:

  numpy                               1.14.2-py35hdbf6ddf_0 --> 1.15.2-py35h1d66e8a_0

The following packages will be DOWNGRADED:

  mkl                                            2019.4-243 --> 2018.0.3-1


Proceed ([y]/n)? y

Then install PyTorch again:

conda install pytorch torchvision cudatoolkit=10.1 -c pytorch

## Package Plan ##

  environment location: /home/silverant/anaconda3/envs/rl_gym_book

  added / updated specs:
    - cudatoolkit=10.1
    - pytorch
    - torchvision


The following NEW packages will be INSTALLED:

  cudatoolkit        pkgs/main/linux-64::cudatoolkit-10.1.243-h6bb024c_0
  libtiff            pkgs/main/linux-64::libtiff-4.1.0-h2733197_0
  olefile            pkgs/main/linux-64::olefile-0.46-py35_0
  pillow             pkgs/main/linux-64::pillow-5.2.0-py35heded4f4_0
  pytorch            pytorch/linux-64::pytorch-1.3.1-py3.5_cuda10.1.243_cudnn7.6.3_0
  torchvision        pytorch/linux-64::torchvision-0.4.2-py35_cu101
  zstd               pkgs/main/linux-64::zstd-1.3.7-h0b5b093_0


Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done

After that, torch.cuda.is_available() returns True:

$ python -c 'import torch;print(torch.cuda.is_available())'
True

Thank you very much sir.

best regards,
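The build string in the conda output above (py3.5_cuda10.1.243_cudnn7.6.3_0) names the CUDA and cuDNN versions bundled with the binary, while a CPU-only build string such as py3.7_cpu_0 does not. A rough sketch of reading that information from a build string follows. The pattern is inferred only from the build strings quoted in this thread, not from a documented naming scheme, and parse_pytorch_build is an illustrative name.

```python
import re

# Matches build strings such as "py3.5_cuda10.1.243_cudnn7.6.3_0".
# Assumption: the "cudaX_cudnnY" layout seen in this thread.
_CUDA_BUILD = re.compile(r"cuda(?P<cuda>[\d.]+)_cudnn(?P<cudnn>[\d.]+)")


def parse_pytorch_build(build):
    """Extract CUDA/cuDNN versions from a conda build string.

    Returns a dict with "cuda" and "cudnn" keys; both are None when
    the build string does not name a CUDA version (CPU-only builds).
    """
    match = _CUDA_BUILD.search(build)
    if match is None:
        return {"cuda": None, "cudnn": None}
    return {"cuda": match.group("cuda"), "cudnn": match.group("cudnn")}


print(parse_pytorch_build("py3.5_cuda10.1.243_cudnn7.6.3_0"))
# {'cuda': '10.1.243', 'cudnn': '7.6.3'}
print(parse_pytorch_build("py3.7_cpu_0"))
# {'cuda': None, 'cudnn': None}
```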

mcskwayrd · February 1, 2020, 4:15am

It looks like I’m going to need to install the whole thing from source, i.e. switching to 10.1 isn’t going to work for me.

The instructions for installing from source also mention “# Add LAPACK support for the GPU if needed” but then rely on prebuilt packages for magma that don’t include CUDA 10.2.

Two questions (e.g. for @ptrblck):

What happens if we don’t install magma – do we get any GPU support, or is it just not as lightning-fast as it could be?
If we’re going to build magma from source, is there any recommended way to do it, e.g. selecting the mkl + gcc version of the magma example make.inc?

Thanks.

ptrblck · February 1, 2020, 5:38am

Linear algebra methods rely on magma (e.g. here) so you won’t be able to use them. However, a lot of models don’t use these methods, so as long as you don’t need to e.g. calculate the log determinant of your weight parameter, you should be fine.
I would recommend sticking to pytorch/builder for the magma build.
For CUDA10.2 you shouldn’t need any changes in the build script besides the CUDA version change, but let me know if you get stuck. Also, you could have a look at the NGC PyTorch container, which ships with CUDA10.2.

PavlosTiritiris · February 1, 2020, 6:21pm

Hi, I did the same thing as you, but it didn’t work.
My OS is Windows 10. My CUDA version is 10.2:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:32:27_Pacific_Daylight_Time_2019
Cuda compilation tools, release 10.2, V10.2.89

Remove pytorch:

(base) C:\Users\pavlo>conda remove pytorch
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: C:\Users\pavlo\Miniconda3

  removed specs:
    - pytorch


The following packages will be REMOVED:

  blas-1.0-mkl
  freetype-2.9.1-ha9979f8_1
  icc_rt-2019.0.0-h0cc432a_1
  intel-openmp-2019.4-245
  jpeg-9b-hb83a4c4_2
  libpng-1.6.37-h2a8f88b_0
  libtiff-4.1.0-h56a325e_0
  mkl-2019.4-245
  mkl-service-2.3.0-py37hb782905_0
  mkl_fft-1.0.15-py37h14836fe_0
  mkl_random-1.1.0-py37h675688f_0
  ninja-1.9.0-py37h74a9793_0
  numpy-1.18.1-py37h93ca92e_0
  numpy-base-1.18.1-py37hc3f5095_1
  olefile-0.46-py37_0
  pillow-7.0.0-py37hcc1f983_0
  pytorch-1.4.0-py3.7_cpu_0
  tk-8.6.8-hfa6e2cd_0
  torchvision-0.5.0-py37_cpu
  xz-5.2.4-h2fa13f4_4
  zlib-1.2.11-h62dcd97_3
  zstd-1.3.7-h508b16e_0


Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done

Install pytorch again:

(base) C:\Users\pavlo>conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: C:\Users\pavlo\Miniconda3

  added / updated specs:
    - cudatoolkit=10.1
    - pytorch
    - torchvision


The following NEW packages will be INSTALLED:

  blas               pkgs/main/win-64::blas-1.0-mkl
  freetype           pkgs/main/win-64::freetype-2.9.1-ha9979f8_1
  icc_rt             pkgs/main/win-64::icc_rt-2019.0.0-h0cc432a_1
  intel-openmp       pkgs/main/win-64::intel-openmp-2019.4-245
  jpeg               pkgs/main/win-64::jpeg-9b-hb83a4c4_2
  libpng             pkgs/main/win-64::libpng-1.6.37-h2a8f88b_0
  libtiff            pkgs/main/win-64::libtiff-4.1.0-h56a325e_0
  mkl                pkgs/main/win-64::mkl-2019.4-245
  mkl-service        pkgs/main/win-64::mkl-service-2.3.0-py37hb782905_0
  mkl_fft            pkgs/main/win-64::mkl_fft-1.0.15-py37h14836fe_0
  mkl_random         pkgs/main/win-64::mkl_random-1.1.0-py37h675688f_0
  ninja              pkgs/main/win-64::ninja-1.9.0-py37h74a9793_0
  numpy              pkgs/main/win-64::numpy-1.18.1-py37h93ca92e_0
  numpy-base         pkgs/main/win-64::numpy-base-1.18.1-py37hc3f5095_1
  olefile            pkgs/main/win-64::olefile-0.46-py37_0
  pillow             pkgs/main/win-64::pillow-7.0.0-py37hcc1f983_0
  pytorch            pytorch/win-64::pytorch-1.4.0-py3.7_cpu_0
  tk                 pkgs/main/win-64::tk-8.6.8-hfa6e2cd_0
  torchvision        pytorch/win-64::torchvision-0.5.0-py37_cpu
  xz                 pkgs/main/win-64::xz-5.2.4-h2fa13f4_4
  zlib               pkgs/main/win-64::zlib-1.2.11-h62dcd97_3
  zstd               pkgs/main/win-64::zstd-1.3.7-h508b16e_0


Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done

(base) C:\Users\pavlo>python -c "import torch;print(torch.cuda.is_available())"
False

Do you have any idea what is going on here? Here is my conda list; I wonder why there is a [cpuonly] tag on the pytorch package.

(base) C:\Users\pavlo>conda list
# packages in environment at C:\Users\pavlo\Miniconda3:
#
# Name                    Version                   Build  Channel
absl-py                   0.8.1                    pypi_0    pypi
asn1crypto                1.3.0                    py37_0
astor                     0.8.0                    pypi_0    pypi
atari-py                  0.2.6                    pypi_0    pypi
blas                      1.0                         mkl
ca-certificates           2020.1.1                      0
certifi                   2019.11.28               py37_0
cffi                      1.13.2           py37h7a1dbc1_0
chardet                   3.0.4                 py37_1003
cloudpickle               1.2.2                    pypi_0    pypi
conda                     4.8.1                    py37_0
conda-package-handling    1.6.0            py37h62dcd97_0
console_shortcut          0.1.1                         3
cpuonly                   1.0                           0    pytorch
cryptography              2.8              py37h7a1dbc1_0
cudatoolkit               10.1.243             h74a9793_0
cycler                    0.10.0                   pypi_0    pypi
cython                    0.29.13                  pypi_0    pypi
freetype                  2.9.1                ha9979f8_1
future                    0.18.0                   pypi_0    pypi
gast                      0.2.2                    pypi_0    pypi
google-pasta              0.1.7                    pypi_0    pypi
grpcio                    1.24.1                   pypi_0    pypi
gym                       0.15.3                   pypi_0    pypi
h5py                      2.10.0                   pypi_0    pypi
icc_rt                    2019.0.0             h0cc432a_1
idna                      2.8                      py37_0
intel-openmp              2019.4                      245
jpeg                      9b                   hb83a4c4_2
keras                     2.3.1                    pypi_0    pypi
keras-applications        1.0.8                    pypi_0    pypi
keras-preprocessing       1.1.0                    pypi_0    pypi
keras-rl                  0.4.2                    pypi_0    pypi
kiwisolver                1.1.0                    pypi_0    pypi
libpng                    1.6.37               h2a8f88b_0
libtiff                   4.1.0                h56a325e_0
lxml                      4.4.1                    pypi_0    pypi
markdown                  3.1.1                    pypi_0    pypi
matplotlib                3.1.1                    pypi_0    pypi
menuinst                  1.4.16           py37he774522_0
mkl                       2019.4                      245
mkl-service               2.3.0            py37hb782905_0
mkl_fft                   1.0.15           py37h14836fe_0
mkl_random                1.1.0            py37h675688f_0
ninja                     1.9.0            py37h74a9793_0
numpy                     1.17.2                   pypi_0    pypi
numpy-base                1.18.1           py37hc3f5095_1
olefile                   0.46                     py37_0
opencv-python             4.1.1.26                 pypi_0    pypi
openssl                   1.1.1d               he774522_3
opt-einsum                3.1.0                    pypi_0    pypi
pillow                    7.0.0            py37hcc1f983_0
pip                       20.0.2                   py37_1
powershell_shortcut       0.0.1                         2
protobuf                  3.10.0                   pypi_0    pypi
psutil                    5.6.3                    pypi_0    pypi
pycosat                   0.6.3            py37he774522_0
pycparser                 2.19                     py37_0
pyglet                    1.3.2                    pypi_0    pypi
pyopenssl                 19.1.0                   py37_0
pyparsing                 2.4.3                    pypi_0    pypi
pyprind                   2.11.2                   pypi_0    pypi
pysocks                   1.7.1                    py37_0
python                    3.7.3                h8c8aaf0_1
python-dateutil           2.8.1                    pypi_0    pypi
pytils                    0.3                      pypi_0    pypi
pytorch                   1.4.0               py3.7_cpu_0  [cpuonly]  pytorch
pywin32                   227              py37he774522_1
pyyaml                    5.1.2                    pypi_0    pypi
requests                  2.22.0                   py37_1
rlpyt                     0.1.1.dev0                dev_0    <develop>
ruamel_yaml               0.15.87          py37he774522_0
scipy                     1.3.1                    pypi_0    pypi
setuptools                45.1.0                   py37_0
six                       1.14.0                   py37_0
sqlite                    3.30.1               he774522_0
tb-nightly                1.14.0a20190603          pypi_0    pypi
tensorboard               2.0.0                    pypi_0    pypi
tensorflow                2.0.0b0                  pypi_0    pypi
tensorflow-estimator      2.0.0                    pypi_0    pypi
termcolor                 1.1.0                    pypi_0    pypi
tf-estimator-nightly      1.14.0.dev2019060501          pypi_0    pypi
tk                        8.6.8                hfa6e2cd_0
tools                     0.1.9                    pypi_0    pypi
torchvision               0.5.0                  py37_cpu  [cpuonly]  pytorch
tqdm                      4.42.0                     py_0
urllib3                   1.25.8                   py37_0
vc                        14.1                 h0510ff6_4
vs2015_runtime            14.16.27012          hf0eaf9b_1
werkzeug                  0.16.0                   pypi_0    pypi
wheel                     0.34.1                   py37_0
win_inet_pton             1.1.0                    py37_0
wincertstore              0.2                      py37_0
wrapt                     1.11.2                   pypi_0    pypi
xz                        5.2.4                h2fa13f4_4
yaml                      0.1.7                hc54c509_2
zlib                      1.2.11               h62dcd97_3
zstd                      1.3.7                h508b16e_0
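The listing above contains both a cpuonly package from the pytorch channel and [cpuonly] tags on pytorch and torchvision, which is consistent with the py3.7_cpu_0 build being installed despite cudatoolkit=10.1 being present. A small sketch for spotting such markers in pasted conda list output follows. The function name find_cpuonly_markers is hypothetical, and the sample rows are copied from the listing above.

```python
def find_cpuonly_markers(conda_list_output):
    """Return names of packages that point to a CPU-only install.

    A package is flagged if it is the "cpuonly" package itself or if
    its row carries the "[cpuonly]" tag.
    """
    flagged = []
    for line in conda_list_output.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split()
        name = fields[0]
        if name == "cpuonly" or "[cpuonly]" in fields[1:]:
            flagged.append(name)
    return flagged


sample = """\
# Name                    Version                   Build  Channel
cpuonly                   1.0                           0    pytorch
cudatoolkit               10.1.243             h74a9793_0
pytorch                   1.4.0               py3.7_cpu_0  [cpuonly]  pytorch
torchvision               0.5.0                  py37_cpu  [cpuonly]  pytorch
"""
print(find_cpuonly_markers(sample))
# ['cpuonly', 'pytorch', 'torchvision']
```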

ptrblck · February 1, 2020, 8:11pm

Try to update conda, create a new conda environment, and rerun the install command.
Sometimes conda and pip seem to have problems finding the right version.

PavlosTiritiris · February 1, 2020, 8:53pm

I did, but it looks like I have an older version of the NVIDIA driver.

>>> a = torch.tensor([]).cuda()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\pavlo\Miniconda3\envs\TorchEnv\lib\site-packages\torch\cuda\__init__.py", line 194, in _lazy_init
    _check_driver()
  File "C:\Users\pavlo\Miniconda3\envs\TorchEnv\lib\site-packages\torch\cuda\__init__.py", line 102, in _check_driver
    raise AssertionError("""
AssertionError:
The NVIDIA driver on your system is too old (found version 9010).
Please update your GPU driver by downloading and installing a new
version from the URL: http://www.nvidia.com/Download/index.aspx
Alternatively, go to: https://pytorch.org to install
a PyTorch version that has been compiled with your version
of the CUDA driver.
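The "found version 9010" in this message appears to use the CUDA integer encoding of 1000 * major + 10 * minor, so 9010 would correspond to a driver that supports up to CUDA 9.1, which is older than the 10.1 the installed binary needs. A short sketch under that assumption (both function names are illustrative):

```python
def decode_cuda_version(encoded):
    """Split an integer like 9010 into (major, minor) = (9, 1).

    Assumes the CUDA convention of 1000 * major + 10 * minor.
    """
    major, remainder = divmod(encoded, 1000)
    return major, remainder // 10


def driver_supports(driver_encoded, required):
    """True if the driver's CUDA version is at least `required`."""
    return decode_cuda_version(driver_encoded) >= tuple(required)


print(decode_cuda_version(9010))        # (9, 1)
print(driver_supports(9010, (10, 1)))   # False
print(driver_supports(10020, (10, 2)))  # True
```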

ptrblck · February 2, 2020, 1:10am

Try to update the driver and reinstall PyTorch.

Shisho_Sama · February 2, 2020, 3:29am

I have PyTorch 1.4 installed with CUDA 10.2 and cuDNN 7.6.x and it’s working just fine! (Ubuntu 18.04)
There must be something else preventing successful execution, IMHO!

ptrblck · February 2, 2020, 3:35am

I think @PavlosTiritiris is currently trying to install the binaries with CUDA10.1, so creating a new topic might be a better idea to keep this topic clean, as it’s related to a CUDA10.2 installation.

PavlosTiritiris · February 5, 2020, 11:52pm

It turned out that my GPU had CUDA compute capability < 3, which PyTorch doesn’t support, so I tried a machine with compute capability 5 and the installation was successful.
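A compute capability check like the one described here can be sketched as follows. The default minimum of (3, 0) simply mirrors the "< 3" remark in this post and is an assumption, not an official PyTorch requirement; the prebuilt binaries may need a higher value. The helper name capability_supported is illustrative, and the live query only runs if torch is installed and sees a GPU.

```python
def capability_supported(capability, minimum=(3, 0)):
    """Compare a (major, minor) compute capability against a minimum.

    The default minimum is an assumption taken from this thread.
    """
    return tuple(capability) >= tuple(minimum)


print(capability_supported((2, 1)))  # False
print(capability_supported((5, 0)))  # True

try:
    import torch
except ImportError:
    torch = None  # PyTorch is not installed; skip the live check.

if torch is not None and torch.cuda.is_available():
    # get_device_capability returns a (major, minor) tuple.
    cap = torch.cuda.get_device_capability(0)
    print(cap, capability_supported(cap))
```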

Lornatang · February 6, 2020, 4:33am

In my experience, CUDA 10.2 works well. I saw no exceptions with my pip installation.

Emre_Bayram · February 21, 2020, 7:19pm

This saved me a bunch of time as well, thanks!

Phantomfancy · March 17, 2020, 7:29am

So does this mean that installing cudatoolkit=10.1 (and PyTorch) on a system with CUDA 10.2 solves the problem?
I ran into the same problem and tried the suggestion on a server with CUDA 10.2:

conda install pytorch torchvision cudatoolkit=10.1

However, when I tried moving a tensor to CUDA, an error occurred:

AssertionError: Torch not compiled with CUDA enabled

What’s wrong with that? Actually, I thought that for a server with CUDA 10.2 installed, it would be reasonable to install PyTorch with cudatoolkit=10.2 rather than 10.1, so I’m also confused by the previous suggestion.
Hoping for your reply!
PS: I have also tried installing PyTorch with cudatoolkit=10.2 and got the same error as before…

ptrblck · March 17, 2020, 7:37am

Could you post the log from the installation, please?

Phantomfancy · March 17, 2020, 7:56am

Oh, I changed the channel to pytorch and the problem was solved!

wassimseif · March 25, 2020, 5:55pm

Hey,

Does that mean each conda env will have its own CUDA binaries?

ptrblck · March 26, 2020, 1:58am

Each conda env should use its own set of installed libraries.
I’m not sure if some libs are reused from the base environment, but you can definitely install different PyTorch builds in separate environments.
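Since each environment carries its own packages, it can help to confirm which environment the running interpreter belongs to before checking torch there. A minimal standard-library sketch follows. The helper name active_environment is illustrative, and the two environment variables are the ones conda sets on activation, so they are None outside an activated conda env.

```python
import os
import sys


def active_environment():
    """Describe the Python environment the interpreter is running from."""
    return {
        # Root directory of the running interpreter's environment.
        "sys_prefix": sys.prefix,
        # Set by `conda activate`; None if no conda env is activated.
        "conda_prefix": os.environ.get("CONDA_PREFIX"),
        "conda_env_name": os.environ.get("CONDA_DEFAULT_ENV"),
    }


for key, value in active_environment().items():
    print(f"{key}: {value}")
```

If sys_prefix and conda_prefix differ, the python on the PATH is probably not the one from the activated environment, which could explain seeing a different PyTorch build than expected.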

priyanka_prusty · April 29, 2020, 5:57pm

Hi, I have CUDA 10.2 and have installed PyTorch with cudatoolkit=10.2. I am getting the error AssertionError: Torch not compiled with CUDA enabled. I get the same error with cudatoolkit=10.1. Could you tell me what is going wrong?