I’m trying to replicate Pedro Cuenca’s code on accelerated diffusers using pytorch 2.0 + CUDA 11.8 on Ubuntu 22.04. However, I’m getting the following error:
Could not load library libcudnn_cnn_infer.so.8. Error: libnvrtc.so: cannot open shared object file: No such file or directory
Here is the relevant code:
import diffusers
import torch
from diffusers import StableDiffusionPipeline
from diffusers.models.cross_attention import AttnProcessor2_0

print('diffusers version', diffusers.__version__)
print('torch version', torch.__version__)
print('cudnn version', torch.backends.cudnn.version())

# Load the model and switch the UNet to the PyTorch 2.0
# scaled-dot-product attention processor.
pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2")
pipe.to("cuda")
pipe.unet.set_attn_processor(AttnProcessor2_0())

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]
I’m on diffusers 0.14.0, torch 2.0.0+cu118, and cuDNN 8700.
I’m using a conda env and have tried installing PyTorch with both pip and conda. The library named in the error (libcudnn_cnn_infer.so.8) does seem to be properly installed within the virtual env. I also tried a workaround before running the script, but I still get the same error.
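For reference, this is roughly how I checked that the library ships with the install (just a sketch; the exact paths depend on the env layout):

import glob
import os
import torch

# The pip/conda wheels bundle their CUDA/cuDNN libraries under torch/lib.
lib_dir = os.path.join(os.path.dirname(torch.__file__), "lib")
print(glob.glob(os.path.join(lib_dir, "libcudnn_cnn_infer*")))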
Thanks @ptrblck! I fixed it by installing cuda-11-8 from Nvidia’s developer repo:
sudo apt install cuda-11-8
I had wrongly assumed that the missing library was libcudnn_cnn_infer.so.8, which is correctly installed by the PyTorch pip/conda packages; the library that is actually missing is libnvrtc.so, which is not. After installing the CUDA package I can now run the example code.
If you think this is a packaging bug, I can submit an issue on GitHub.
Yes, the issue in your environment is caused by a missing libnvrtc.so. The library is shipped as python3.8/site-packages/torch/lib/libnvrtc-672ee683.so.11.2, but I guess libcudnn cannot resolve this dependency under the plain soname when cuDNN’s runtime fusion API is used.
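A minimal way to see the mismatch from Python (a sketch assuming the pip wheel layout):

import ctypes
import glob
import os
import torch

lib_dir = os.path.join(os.path.dirname(torch.__file__), "lib")
# The wheel ships only a hash-suffixed copy, e.g. libnvrtc-672ee683.so.11.2 ...
print(glob.glob(os.path.join(lib_dir, "libnvrtc*")))

# ... while cuDNN dlopen()s the plain soname, which fails unless a system-wide
# CUDA installation provides it:
try:
    ctypes.CDLL("libnvrtc.so")
    print("libnvrtc.so resolved")
except OSError as exc:
    print("libnvrtc.so not found:", exc)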
Let me check my workflow with LD_DEBUG to see where my environment loads this library from, and I can follow up with an issue and, if needed, a fix in the pytorch/builder repository.
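Something along these lines captures the loader trace (just a sketch; the conv forces cuDNN, and hence the nvrtc dlopen, to load, and LD_DEBUG writes to stderr):

import os
import subprocess

env = dict(os.environ, LD_DEBUG="libs")
proc = subprocess.run(
    ["python", "-c",
     "import torch; import torch.nn.functional as F; "
     "F.conv2d(torch.randn(1, 1, 8, 8, device='cuda'), "
     "torch.randn(1, 1, 3, 3, device='cuda'))"],
    env=env, capture_output=True, text=True,
)
# Print only the loader lines that mention nvrtc to see where it resolves from.
for line in proc.stderr.splitlines():
    if "nvrtc" in line:
        print(line)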
>>> print('torch version', torch.__version__)
torch version 2.0.0+cu117
>>> print('cudnn version', torch.backends.cudnn.version())
cudnn version 8500
>>> torch.cuda.is_available()
True
I currently have CUDA toolkit 12.1 installed.
The linked issue is already closed and was fixed via this PR, so installing the nightly binaries should fix the issue in case you have trouble applying the posted workaround.
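Once reinstalled, a quick way to confirm you are actually on a nightly is to check the version string (a sketch; the exact date tag below is only an example):

import torch
# Nightly wheels report a dev version, e.g. 2.1.0.dev20230330+cu118
print(torch.__version__)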
Thank you! It’s working with the nightly build, but now I get this error:
_check_cuda_version(compiler_name, compiler_version)
File "/home/prabhat/.local/share/virtualenvs/GroundingSam-IWDf2mQY/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 388, in _check_cuda_version
raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda))
RuntimeError:
The detected CUDA version (12.1) mismatches the version that was used to compile
PyTorch (11.8). Please make sure to use the same CUDA versions.
It seems you are trying to build a custom CUDA extension, which requires you to use the same locally installed CUDA toolkit as the one used to build the binaries (11.8).
As I’ve also mentioned in the linked topic, your locally installed CUDA toolkit will be used if you build PyTorch from source or a custom CUDA extension, which is apparently the case here and is causing your error.
Pure PyTorch code does not need a locally installed CUDA toolkit.
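A quick way to compare the two versions (a sketch; assumes nvcc is on your PATH when a local toolkit is installed):

import subprocess
import torch

# CUDA version the PyTorch binaries were built with, e.g. 11.8
print("torch built with CUDA:", torch.version.cuda)

# CUDA version of the locally installed toolkit picked up by cpp_extension,
# e.g. release 12.1
try:
    out = subprocess.run(["nvcc", "--version"],
                         capture_output=True, text=True).stdout
    for line in out.splitlines():
        if "release" in line:
            print("local toolkit:", line.strip())
except FileNotFoundError:
    print("no nvcc on PATH; pure PyTorch code does not need one")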