Large CUDA dependencies in PyTorch 1.13 (conda installation)

Hello,

I have searched online, but no one seems to have reported this issue. Installing PyTorch 1.13 pulls in a lot of CUDA dependencies (apart from cudatoolkit) that are quite large, making the conda environment huge. I’m not sure all of those dependencies are necessary, since previous versions of PyTorch don’t seem to need them.

Following the official installation instructions:

conda install pytorch torchvision pytorch-cuda=11.7 -c pytorch -c nvidia

Output from conda

The following NEW packages will be INSTALLED:

  blas               pkgs/main/linux-64::blas-1.0-mkl
  cuda               nvidia/linux-64::cuda-11.7.1-0
  cuda-cccl          nvidia/linux-64::cuda-cccl-11.7.91-0
  cuda-command-line~ nvidia/linux-64::cuda-command-line-tools-11.7.1-0
  cuda-compiler      nvidia/linux-64::cuda-compiler-11.7.1-0
  cuda-cudart        nvidia/linux-64::cuda-cudart-11.7.99-0
  cuda-cudart-dev    nvidia/linux-64::cuda-cudart-dev-11.7.99-0
  cuda-cuobjdump     nvidia/linux-64::cuda-cuobjdump-11.7.91-0
  cuda-cupti         nvidia/linux-64::cuda-cupti-11.7.101-0
  cuda-cuxxfilt      nvidia/linux-64::cuda-cuxxfilt-11.7.91-0
  cuda-demo-suite    nvidia/linux-64::cuda-demo-suite-11.8.86-0
  cuda-documentation nvidia/linux-64::cuda-documentation-11.8.86-0
  cuda-driver-dev    nvidia/linux-64::cuda-driver-dev-11.7.99-0
  cuda-gdb           nvidia/linux-64::cuda-gdb-11.8.86-0
  cuda-libraries     nvidia/linux-64::cuda-libraries-11.7.1-0
  cuda-libraries-dev nvidia/linux-64::cuda-libraries-dev-11.7.1-0
  cuda-memcheck      nvidia/linux-64::cuda-memcheck-11.8.86-0
  cuda-nsight        nvidia/linux-64::cuda-nsight-11.8.86-0
  cuda-nsight-compu~ nvidia/linux-64::cuda-nsight-compute-11.8.0-0
  cuda-nvcc          nvidia/linux-64::cuda-nvcc-11.7.99-0
  cuda-nvdisasm      nvidia/linux-64::cuda-nvdisasm-11.8.86-0
  cuda-nvml-dev      nvidia/linux-64::cuda-nvml-dev-11.7.91-0
  cuda-nvprof        nvidia/linux-64::cuda-nvprof-11.8.87-0
  cuda-nvprune       nvidia/linux-64::cuda-nvprune-11.7.91-0
  cuda-nvrtc         nvidia/linux-64::cuda-nvrtc-11.7.99-0
  cuda-nvrtc-dev     nvidia/linux-64::cuda-nvrtc-dev-11.7.99-0
  cuda-nvtx          nvidia/linux-64::cuda-nvtx-11.7.91-0
  cuda-nvvp          nvidia/linux-64::cuda-nvvp-11.8.87-0
  cuda-runtime       nvidia/linux-64::cuda-runtime-11.7.1-0
  cuda-sanitizer-api nvidia/linux-64::cuda-sanitizer-api-11.8.86-0
  cuda-toolkit       nvidia/linux-64::cuda-toolkit-11.7.1-0
  cuda-tools         nvidia/linux-64::cuda-tools-11.7.1-0
  cuda-visual-tools  nvidia/linux-64::cuda-visual-tools-11.7.1-0
  gds-tools          nvidia/linux-64::gds-tools-1.4.0.31-0
  giflib             pkgs/main/linux-64::giflib-5.2.1-h7b6447c_0
  intel-openmp       pkgs/main/linux-64::intel-openmp-2021.4.0-h06a4308_3561
  jpeg               pkgs/main/linux-64::jpeg-9e-h7f8727e_0
  lcms2              pkgs/main/linux-64::lcms2-2.12-h3be6417_0
  lerc               pkgs/main/linux-64::lerc-3.0-h295c915_0
  libcublas          nvidia/linux-64::libcublas-11.11.3.6-0
  libcublas-dev      nvidia/linux-64::libcublas-dev-11.11.3.6-0
  libcufft           nvidia/linux-64::libcufft-10.9.0.58-0
  libcufft-dev       nvidia/linux-64::libcufft-dev-10.9.0.58-0
  libcufile          nvidia/linux-64::libcufile-1.4.0.31-0
  libcufile-dev      nvidia/linux-64::libcufile-dev-1.4.0.31-0
  libcurand          nvidia/linux-64::libcurand-10.3.0.86-0
  libcurand-dev      nvidia/linux-64::libcurand-dev-10.3.0.86-0
  libcusolver        nvidia/linux-64::libcusolver-11.4.1.48-0
  libcusolver-dev    nvidia/linux-64::libcusolver-dev-11.4.1.48-0
  libcusparse        nvidia/linux-64::libcusparse-11.7.5.86-0
  libcusparse-dev    nvidia/linux-64::libcusparse-dev-11.7.5.86-0
  libdeflate         pkgs/main/linux-64::libdeflate-1.8-h7f8727e_5
  libnpp             nvidia/linux-64::libnpp-11.8.0.86-0
  libnpp-dev         nvidia/linux-64::libnpp-dev-11.8.0.86-0
  libnvjpeg          nvidia/linux-64::libnvjpeg-11.9.0.86-0
  libnvjpeg-dev      nvidia/linux-64::libnvjpeg-dev-11.9.0.86-0
  libtiff            pkgs/main/linux-64::libtiff-4.4.0-hecacb30_0
  libwebp            pkgs/main/linux-64::libwebp-1.2.4-h11a3e52_0
  libwebp-base       pkgs/main/linux-64::libwebp-base-1.2.4-h5eee18b_0
  mkl                pkgs/main/linux-64::mkl-2021.4.0-h06a4308_640
  mkl-service        pkgs/main/linux-64::mkl-service-2.4.0-py38h7f8727e_0
  mkl_fft            pkgs/main/linux-64::mkl_fft-1.3.1-py38hd3c417c_0
  mkl_random         pkgs/main/linux-64::mkl_random-1.2.2-py38h51133e4_0
  nsight-compute     nvidia/linux-64::nsight-compute-2022.3.0.22-0
  numpy              pkgs/main/linux-64::numpy-1.23.3-py38h14f4228_1
  numpy-base         pkgs/main/linux-64::numpy-base-1.23.3-py38h31eccc5_1
  pillow             pkgs/main/linux-64::pillow-9.2.0-py38hace64e9_1
  pytorch            pytorch/linux-64::pytorch-1.13.0-py3.8_cuda11.7_cudnn8.5.0_0
  pytorch-cuda       pytorch/noarch::pytorch-cuda-11.7-h67b0de4_0
  pytorch-mutex      pytorch/noarch::pytorch-mutex-1.0-cuda
  torchvision        pytorch/linux-64::torchvision-0.14.0-py38_cu117
  typing_extensions  pkgs/main/linux-64::typing_extensions-4.3.0-py38h06a4308_0

Listing library sizes with du -h -s $(conda info --base)/envs/diffusers/lib/* | sort -hr (diffusers is my environment name):

2.9G    /home/students/acct3001_02/miniconda3/envs/diffusers/lib/python3.10
916M    /home/students/acct3001_02/miniconda3/envs/diffusers/lib/libcublasLt_static.a
548M    /home/students/acct3001_02/miniconda3/envs/diffusers/lib/libcublasLt.so.11.11.3.6
308M    /home/students/acct3001_02/miniconda3/envs/diffusers/lib/libcusparse_static.a
300M    /home/students/acct3001_02/miniconda3/envs/diffusers/lib/libcusolver_static.a
294M    /home/students/acct3001_02/miniconda3/envs/diffusers/lib/libcufft_static_nocallback.a
286M    /home/students/acct3001_02/miniconda3/envs/diffusers/lib/libcusolver.so.11.4.1.48
281M    /home/students/acct3001_02/miniconda3/envs/diffusers/lib/libcufft_static.a
267M    /home/students/acct3001_02/miniconda3/envs/diffusers/lib/libcusparse.so.11.7.5.86
267M    /home/students/acct3001_02/miniconda3/envs/diffusers/lib/libcufft.so.10.9.0.58
175M    /home/students/acct3001_02/miniconda3/envs/diffusers/lib/libcusolverMg.so.11.4.1.48
120M    /home/students/acct3001_02/miniconda3/envs/diffusers/lib/libcublas_static.a
102M    /home/students/acct3001_02/miniconda3/envs/diffusers/lib/libnppif_static.a
99M     /home/students/acct3001_02/miniconda3/envs/diffusers/lib/libnppif.so.11.8.0.86
97M     /home/students/acct3001_02/miniconda3/envs/diffusers/lib/libcurand_static.a
97M     /home/students/acct3001_02/miniconda3/envs/diffusers/lib/libcurand.so.10.3.0.86
91M     /home/students/acct3001_02/miniconda3/envs/diffusers/lib/libcublas.so.11.11.3.6
72M     /home/students/acct3001_02/miniconda3/envs/diffusers/lib/libmkl_core.so.1
68M     /home/students/acct3001_02/miniconda3/envs/diffusers/lib/libnvrtc_static.a

There is also nsight-compute (at /home/students/acct3001_02/miniconda3/envs/diffusers/nsight-compute), taking another 1GB. I probably haven’t counted all the new CUDA libraries. The total size of the CUDA libraries easily adds up to 5GB+, which is much larger than before.
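To see how much of a listing like the one above is static archives (the *.a files, which as far as I can tell ship with the -dev packages and only matter when compiling against CUDA), the du output can be summed with a short script. This is only a minimal sketch: static_archive_mb and SAMPLE are names made up for illustration, and SAMPLE holds just the first three library lines from the listing.

```python
# Conversion factors from the unit suffixes printed by `du -h` to megabytes.
UNITS_IN_MB = {"K": 1 / 1024, "M": 1.0, "G": 1024.0}

# Three lines copied from the listing above (hypothetical sample, not the full output).
SAMPLE = """\
916M    /home/students/acct3001_02/miniconda3/envs/diffusers/lib/libcublasLt_static.a
548M    /home/students/acct3001_02/miniconda3/envs/diffusers/lib/libcublasLt.so.11.11.3.6
308M    /home/students/acct3001_02/miniconda3/envs/diffusers/lib/libcusparse_static.a
"""


def static_archive_mb(du_output):
    """Sum the sizes, in MB, of all static archives (*.a) in `du -h` output."""
    total = 0.0
    for line in du_output.splitlines():
        parts = line.split(None, 1)  # "<size> <path>"
        if len(parts) != 2:
            continue
        size, path = parts
        # Only count static archives whose size uses a known unit suffix.
        if path.strip().endswith(".a") and size[-1] in UNITS_IN_MB:
            total += float(size[:-1]) * UNITS_IN_MB[size[-1]]
    return total


print(static_archive_mb(SAMPLE))  # 1224.0 (916 + 308; the .so file is skipped)
```

Feeding it the full du output instead of SAMPLE gives the total for the whole lib directory.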

My questions would be:

  1. Is this a constraint of the newer CUDA version, i.e. CUDA 11.7?
  2. If not, is it possible to exclude unnecessary CUDA libraries from the default installation?

Thank you!

  1. No, these libs are not specific to CUDA 11.7, but the build process was switched from the unsupported cudatoolkit to the cuda conda binary.

  2. Yes, I believe the PyTorch metapackage could filter out more unnecessary libraries before installing cuda.

Thank you for your quick reply!

  1. What do you mean by “unsupported cudatoolkit”? Is cudatoolkit no longer supported?
  2. I suppose this can only be done by the PyTorch team and users cannot do it directly? If so, does the PyTorch team plan to filter out unnecessary libraries in future releases?

I dug into the dependencies of the conda packages and found the following:

  • pytorch-cuda-11.7 only depends on cuda-11.7, while the other unnecessary libraries are runtime constraints.
  • However, cuda-11.7 depends on cuda-demo-suite, cuda-runtime, and cuda-toolkit. Among them, cuda-runtime and cuda-toolkit depend on cuda-libraries, which covers all of the unnecessary libraries.

I guess in that case the limitation is imposed by how NVIDIA packages its CUDA libraries.
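The dependency chain described above can be walked mechanically to see what each choice of top-level package drags in. The sketch below is illustrative only: the DEPENDS table is typed in by hand from the two bullet points (it is not queried from conda and leaves out everything else those packages depend on), and transitive_deps is a made-up helper name.

```python
from collections import deque

# Dependency edges exactly as described in the post (hand-written, incomplete).
DEPENDS = {
    "pytorch-cuda": ["cuda"],
    "cuda": ["cuda-demo-suite", "cuda-runtime", "cuda-toolkit"],
    "cuda-runtime": ["cuda-libraries"],
    "cuda-toolkit": ["cuda-libraries"],
}


def transitive_deps(package, depends):
    """Return every package reachable from `package` via breadth-first search."""
    seen = set()
    queue = deque([package])
    while queue:
        current = queue.popleft()
        for dep in depends.get(current, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


print(sorted(transitive_deps("pytorch-cuda", DEPENDS)))
# ['cuda', 'cuda-demo-suite', 'cuda-libraries', 'cuda-runtime', 'cuda-toolkit']
print(sorted(transitive_deps("cuda-runtime", DEPENDS)))
# ['cuda-libraries']
```

In this simplified graph, depending on cuda-runtime alone would skip cuda-toolkit and cuda-demo-suite, which is the kind of change only the packagers can make.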

Request for the PyTorch packagers: could pytorch-cuda-11.7 require only cuda-toolkit or cuda-runtime (or whatever NVIDIA calls the CUDA runtime package), in order to decrease disk space usage?

The nightly binaries should have already reduced the dependencies. Have you checked these, or could you do so, please?

I checked the stable PyTorch 1.13.1 release against the nightly, and the nightly has indeed greatly reduced the disk usage!

conda install -y pytorch pytorch-cuda=11.7 -c pytorch -c nvidia in the continuumio/miniconda3 Docker image:

## Package Plan ##

  environment location: /opt/conda/envs/py310-pytorch-1.13.1

  added / updated specs:
    - pytorch
    - pytorch-cuda=11.7


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    _libgcc_mutex-0.1          |             main           3 KB
    _openmp_mutex-5.1          |            1_gnu          21 KB
    blas-1.0                   |              mkl           6 KB
    bzip2-1.0.8                |       h7b6447c_0          78 KB
    certifi-2022.12.7          |  py310h06a4308_0         150 KB
    cuda-11.7.1                |                0           1 KB  nvidia
    cuda-cccl-11.7.91          |                0         1.2 MB  nvidia
    cuda-command-line-tools-11.7.1|                0           1 KB  nvidia
    cuda-compiler-11.7.1       |                0           1 KB  nvidia
    cuda-cudart-11.7.99        |                0         194 KB  nvidia
    cuda-cudart-dev-11.7.99    |                0         1.1 MB  nvidia
    cuda-cuobjdump-11.7.91     |                0         158 KB  nvidia
    cuda-cupti-11.7.101        |                0        22.9 MB  nvidia
    cuda-cuxxfilt-11.7.91      |                0         293 KB  nvidia
    cuda-demo-suite-12.1.55    |                0         5.0 MB  nvidia
    cuda-documentation-12.1.55 |                0          89 KB  nvidia
    cuda-driver-dev-11.7.99    |                0          16 KB  nvidia
    cuda-gdb-12.1.55           |                0         5.3 MB  nvidia
    cuda-libraries-11.7.1      |                0           1 KB  nvidia
    cuda-libraries-dev-11.7.1  |                0           2 KB  nvidia
    cuda-memcheck-11.8.86      |                0         168 KB  nvidia
    cuda-nsight-12.1.55        |                0       113.6 MB  nvidia
    cuda-nsight-compute-12.1.0 |                0           1 KB  nvidia
    cuda-nvcc-11.7.99          |                0        42.7 MB  nvidia
    cuda-nvdisasm-12.1.55      |                0        47.9 MB  nvidia
    cuda-nvml-dev-11.7.91      |                0          80 KB  nvidia
    cuda-nvprof-12.1.55        |                0         4.8 MB  nvidia
    cuda-nvprune-11.7.91       |                0          64 KB  nvidia
    cuda-nvrtc-11.7.99         |                0        17.3 MB  nvidia
    cuda-nvrtc-dev-11.7.99     |                0        16.9 MB  nvidia
    cuda-nvtx-11.7.91          |                0          57 KB  nvidia
    cuda-nvvp-12.1.55          |                0       114.5 MB  nvidia
    cuda-runtime-11.7.1        |                0           1 KB  nvidia
    cuda-sanitizer-api-12.1.55 |                0        16.7 MB  nvidia
    cuda-toolkit-11.7.1        |                0           1 KB  nvidia
    cuda-tools-11.7.1          |                0           1 KB  nvidia
    cuda-visual-tools-11.7.1   |                0           1 KB  nvidia
    flit-core-3.6.0            |     pyhd3eb1b0_0          42 KB
    gds-tools-1.6.0.25         |                0        40.9 MB  nvidia
    intel-openmp-2023.0.0      |   h9e868ea_25371        15.1 MB
    ld_impl_linux-64-2.38      |       h1181459_1         654 KB
    libcublas-11.10.3.66       |                0       286.1 MB  nvidia
    libcublas-dev-11.10.3.66   |                0       296.4 MB  nvidia
    libcufft-10.7.2.124        |       h4fbf590_0        93.6 MB  nvidia
    libcufft-dev-10.7.2.124    |       h98a8f43_0       197.3 MB  nvidia
    libcufile-1.6.0.25         |                0         763 KB  nvidia
    libcufile-dev-1.6.0.25     |                0          13 KB  nvidia
    libcurand-10.3.2.56        |                0        51.7 MB  nvidia
    libcurand-dev-10.3.2.56    |                0         449 KB  nvidia
    libcusolver-11.4.0.1       |                0        78.7 MB  nvidia
    libcusolver-dev-11.4.0.1   |                0        55.9 MB  nvidia
    libcusparse-11.7.4.91      |                0       151.1 MB  nvidia
    libcusparse-dev-11.7.4.91  |                0       309.5 MB  nvidia
    libffi-3.4.2               |       h6a678d5_6         136 KB
    libgcc-ng-11.2.0           |       h1234567_1         5.3 MB
    libgomp-11.2.0             |       h1234567_1         474 KB
    libnpp-11.7.4.75           |                0       129.3 MB  nvidia
    libnpp-dev-11.7.4.75       |                0       126.6 MB  nvidia
    libnvjpeg-11.8.0.2         |                0         2.2 MB  nvidia
    libnvjpeg-dev-11.8.0.2     |                0         1.9 MB  nvidia
    libstdcxx-ng-11.2.0        |       h1234567_1         4.7 MB
    libuuid-1.41.5             |       h5eee18b_0          27 KB
    mkl-2023.0.0               |   h6d00ec8_25399       171.4 MB
    nsight-compute-2023.1.0.15 |                0       770.2 MB  nvidia
    pip-22.3.1                 |  py310h06a4308_0         2.8 MB
    pytorch-1.13.1             |py3.10_cuda11.7_cudnn8.5.0_0        1.14 GB  pytorch
    pytorch-cuda-11.7          |       h67b0de4_1           3 KB  pytorch
    pytorch-mutex-1.0          |             cuda           3 KB  pytorch
    readline-8.2               |       h5eee18b_0         357 KB
    tbb-2021.7.0               |       hdb19cb5_0         1.6 MB
    tk-8.6.12                  |       h1ccaba5_0         3.0 MB
    typing_extensions-4.4.0    |  py310h06a4308_0          46 KB
    tzdata-2022g               |       h04d1e81_0         114 KB
    zlib-1.2.13                |       h5eee18b_0         103 KB
    ------------------------------------------------------------
                                           Total:        4.28 GB

compared to conda install -y pytorch pytorch-cuda=11.7 -c pytorch-nightly -c nvidia:

## Package Plan ##

  environment location: /opt/conda/envs/py310-pytorch-nightly

  added / updated specs:
    - pytorch
    - pytorch-cuda=11.7


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    _libgcc_mutex-0.1          |             main           3 KB
    _openmp_mutex-5.1          |            1_gnu          21 KB
    blas-1.0                   |              mkl           6 KB
    bzip2-1.0.8                |       h7b6447c_0          78 KB
    certifi-2022.9.24          |  py311h06a4308_0         155 KB
    cuda-cudart-11.7.99        |                0         194 KB  nvidia
    cuda-cupti-11.7.101        |                0        22.9 MB  nvidia
    cuda-libraries-11.7.1      |                0           1 KB  nvidia
    cuda-nvrtc-11.7.99         |                0        17.3 MB  nvidia
    cuda-nvtx-11.7.91          |                0          57 KB  nvidia
    cuda-runtime-11.7.1        |                0           1 KB  nvidia
    filelock-3.9.0             |          py311_0          20 KB  pytorch-nightly
    intel-openmp-2023.0.0      |   h9e868ea_25371        15.1 MB
    ld_impl_linux-64-2.38      |       h1181459_1         654 KB
    libcublas-11.10.3.66       |                0       286.1 MB  nvidia
    libcufft-10.7.2.124        |       h4fbf590_0        93.6 MB  nvidia
    libcufile-1.6.0.25         |                0         763 KB  nvidia
    libcurand-10.3.2.56        |                0        51.7 MB  nvidia
    libcusolver-11.4.0.1       |                0        78.7 MB  nvidia
    libcusparse-11.7.4.91      |                0       151.1 MB  nvidia
    libffi-3.4.2               |       h6a678d5_6         136 KB
    libgcc-ng-11.2.0           |       h1234567_1         5.3 MB
    libgomp-11.2.0             |       h1234567_1         474 KB
    libnpp-11.7.4.75           |                0       129.3 MB  nvidia
    libnvjpeg-11.8.0.2         |                0         2.2 MB  nvidia
    libstdcxx-ng-11.2.0        |       h1234567_1         4.7 MB
    libuuid-1.41.5             |       h5eee18b_0          27 KB
    mkl-2023.0.0               |   h6d00ec8_25399       171.4 MB
    mpmath-1.2.1               |          py311_0         1.2 MB  pytorch-nightly
    pip-22.2.2                 |  py311h06a4308_0         2.9 MB
    python-3.11.0              |       h7a1cb2a_2        32.6 MB
    pytorch-2.0.0.dev20230301  |py3.11_cuda11.7_cudnn8.5.0_0        1.20 GB  pytorch-nightly
    pytorch-cuda-11.7          |       h778d358_3           7 KB  pytorch-nightly
    pytorch-mutex-1.0          |             cuda           3 KB  pytorch-nightly
    readline-8.2               |       h5eee18b_0         357 KB
    setuptools-65.5.0          |  py311h06a4308_0         1.4 MB
    sympy-1.11.1               |          py311_0        18.8 MB  pytorch-nightly
    tbb-2021.7.0               |       hdb19cb5_0         1.6 MB
    tk-8.6.12                  |       h1ccaba5_0         3.0 MB
    torchtriton-2.0.0+b8b470bc59|            py311        62.7 MB  pytorch-nightly
    typing_extensions-4.1.1    |     pyh06a4308_0          28 KB
    tzdata-2022g               |       h04d1e81_0         114 KB
    wheel-0.37.1               |     pyhd3eb1b0_0          33 KB
    zlib-1.2.13                |       h5eee18b_0         103 KB
    ------------------------------------------------------------
                                           Total:        2.33 GB
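To check which packages account for the difference, the two package plans can be parsed and compared. The following is a rough sketch, assuming the table layout shown above (name-version | build, size, unit, optional channel); parse_plan, STABLE_ROWS and NIGHTLY_ROWS are hypothetical names, and the sample rows are only a handful copied from the two plans rather than the full tables.

```python
# Conversion factors from the size units in the plan to megabytes.
UNIT_MB = {"KB": 1 / 1024, "MB": 1.0, "GB": 1024.0}

# A few rows copied from the two package plans above (not the full tables).
STABLE_ROWS = """\
    cuda-nvcc-11.7.99          |                0        42.7 MB  nvidia
    libcublas-11.10.3.66       |                0       286.1 MB  nvidia
    libcublas-dev-11.10.3.66   |                0       296.4 MB  nvidia
    nsight-compute-2023.1.0.15 |                0       770.2 MB  nvidia
"""
NIGHTLY_ROWS = """\
    libcublas-11.10.3.66       |                0       286.1 MB  nvidia
"""


def parse_plan(text):
    """Map package name -> download size in MB for each table row of a plan."""
    sizes = {}
    for line in text.splitlines():
        if "|" not in line:
            continue
        spec, rest = line.split("|", 1)
        fields = rest.split()  # build, size, unit, and optionally a channel
        if len(fields) < 3 or fields[2] not in UNIT_MB:
            continue  # header or separator row
        name = spec.strip().rsplit("-", 1)[0]  # strip the trailing "-<version>"
        sizes[name] = float(fields[1]) * UNIT_MB[fields[2]]
    return sizes


stable = parse_plan(STABLE_ROWS)
nightly = parse_plan(NIGHTLY_ROWS)
# Packages the stable plan downloads that the nightly plan no longer needs.
dropped = {name: mb for name, mb in stable.items() if name not in nightly}

print(sorted(dropped))  # ['cuda-nvcc', 'libcublas-dev', 'nsight-compute']
print(f"{sum(dropped.values()):.1f} MB")  # 1109.3 MB
```

Pasting the complete tables into the two strings lists every package the nightly no longer pulls in, along with its download size.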

Do you have an ETA for the diet version to appear? Going from 4.28 GB down to 2.33 GB is quite significant :smiley:

Best regards

Tru

The change landed in the nightly release a few weeks ago, and the next stable 2.0 release is now at RC2 while critical bug fixes are still being finalized. I would guess it might take ~2 more weeks of testing and fixing important issues until the release, but that’s just my guess. :wink:
