PyTorch 1.8.1+cu111 package size and memory usage issue

I am trying to upgrade torch 1.8.1 to use CUDA 11.1, but I found two issues compared with the CUDA 10.2 build:

  • Package size

    • The CUDA 10.2 package is about 900 MB, while the CUDA 11.1 package is about 1900 MB. Is this expected?
  • Runtime memory usage when no CUDA-related function is called

    • I tested by running python3 -c 'import torch' with both versions, and the memory usage differs significantly: the CUDA 10.2 version uses only about 180 MB, while the CUDA 11.1 version uses about 1.0 GB. Below are the libraries loaded by each version, as shown in gdb. It looks like the CUDA 11.1 version loads CUDA kernel libraries such as libtorch_cuda_cu.so even when no CUDA-related function is called. Is this an issue?

      For the CUDA 11.1 build:

      For the CUDA 10.2 build:

BTW, the CUDA 10.2 build was installed with pip install torch==1.8.1, while the CUDA 11.1 build was installed with pip install torch==1.8.1+cu111 -f https://download.pytorch.org/whl/lts/1.8/torch_lts.html.
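For anyone who wants to reproduce the comparison without gdb, here is a minimal sketch, assuming Linux (it reads /proc). It imports a module in a fresh child process, then reports that process's resident set size and the torch/CUDA shared libraries it has mapped. The helper names (loaded_shared_libs, rss_kib, probe_import) and the library-name filter are my own and are not part of PyTorch.

```python
import re
import subprocess
import sys


def loaded_shared_libs(maps_text, pattern=r"torch|cuda|cudnn|cublas"):
    """Return the sorted, unique .so paths in /proc/<pid>/maps text that match pattern."""
    libs = set()
    for line in maps_text.splitlines():
        # Format: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        path = fields[5].strip()
        if ".so" in path and re.search(pattern, path, re.IGNORECASE):
            libs.add(path)
    return sorted(libs)


def rss_kib(status_text):
    """Extract VmRSS (resident set size, in KiB) from /proc/<pid>/status text."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def probe_import(module="torch"):
    """Import `module` in a child process; return (rss_kib, mapped libraries)."""
    child = (
        "import importlib, sys\n"
        "importlib.import_module(sys.argv[1])\n"
        "print(open('/proc/self/status').read())\n"
        "print('---MAPS---')\n"
        "print(open('/proc/self/maps').read())\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", child, module],
        capture_output=True, text=True, check=True,
    )
    status_text, maps_text = result.stdout.split("---MAPS---", 1)
    return rss_kib(status_text), loaded_shared_libs(maps_text)


# Example (run once in each environment, with torch installed):
#   rss, libs = probe_import("torch")
#   print(rss, "KiB resident")
#   print("\n".join(libs))
```

Running probe_import("torch") in each environment should show whether libtorch_cuda_cu.so is mapped right after import. Keep in mind that a mapped library is not necessarily fully resident, so the VmRSS figure and the list of mappings answer slightly different questions.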

  1. Yes, this is expected, since the CUDA libraries increased in size.
  2. I'm unsure whether you are referring to GPU memory, but the CUDA context alone would use roughly 600-1000 MB of device memory, so 180 MB seems too low for CUDA 10.2.

For 2, I mean CPU memory. As I understand it, if the user only runs import torch, it should not load any CUDA-related libraries into CPU memory. I compared the shared libraries of the two versions. In the CUDA 11.1 build, libtorch_cuda_cu.so and libtorch_cuda_cpp.so are new and both are very large; I'm not sure whether that is related. @albanD, any idea?

For torch with CUDA 11.1:

For torch with CUDA 10.2:
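The shared-library comparison can also be captured as text rather than screenshots. Below is a rough sketch that lists the shared libraries in a directory with their sizes. I am assuming the libraries live in the lib folder next to the installed torch package, which is where the pip wheels I have seen put them. The function names shared_lib_sizes and largest are hypothetical helpers, not PyTorch APIs.

```python
from pathlib import Path


def shared_lib_sizes(lib_dir):
    """Map each shared-library file name directly under lib_dir to its size in MiB."""
    sizes = {}
    for path in Path(lib_dir).iterdir():
        # Match both "libfoo.so" and versioned names like "libfoo.so.11".
        if path.is_file() and ".so" in path.name:
            sizes[path.name] = path.stat().st_size / 2**20
    return sizes


def largest(sizes, n=10):
    """Return the n largest (name, size_mib) pairs, biggest first."""
    return sorted(sizes.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Example (assumes the wheel keeps its libraries in <site-packages>/torch/lib;
# find_spec locates the package without importing it):
#   import importlib.util
#   lib_dir = Path(importlib.util.find_spec("torch").origin).parent / "lib"
#   for name, mib in largest(shared_lib_sizes(lib_dir)):
#       print(f"{mib:8.1f} MiB  {name}")
```

Running this in both environments and diffing the output should show which libraries exist only in the CUDA 11.1 install and how much each one adds to the package size.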