Could not load library in new version

I have a model that uses torchaudio.transforms.MelSpectrogram and torchaudio.models.Conformer. It works with torch==2.0.0, torchaudio==2.0.1, and torchdata==0.6.0, but it fails with the latest packages: torch==2.1.0, torchaudio==2.1.0, and torchdata==0.7.0. This is a problem for me because I need features from the newer versions.

The problem that arises is this error:

Could not load library Error: /usr/local/cuda/lib64/ undefined symbol: _ZN5cudnn3cnn5infer22queryClusterPropertiesERPhS3_, version
Traceback (most recent call last):
  File "/venvs/bipu2/lib/python3.10/site-packages/torch/", line 492, in backward
  File "/venvs/bipu2/lib/python3.10/site-packages/torch/autograd/", line 251, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: GET was unable to find an engine to execute this computation

It comes up after running loss.backward(), so it only happens during training.

What’s going on?

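The PyTorch binaries ship with their own CUDA dependencies (including cuDNN), and the error message points to a locally installed cuDNN under /usr/local/cuda/lib64/. The undefined symbol suggests that this local copy does not match the cuDNN version PyTorch expects. Either uninstall the local cuDNN or, as a workaround, remove it from LD_LIBRARY_PATH so that PyTorch can load its own version.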
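
To see which LD_LIBRARY_PATH entries might be shadowing the bundled cuDNN, something like the following rough sketch could help. It uses only the standard library. The helper names find_cudnn_dirs and strip_dirs are made up for illustration, and the libcudnn*.so* file pattern is an assumption about how the local install names its files.

```python
import glob
import os


def find_cudnn_dirs(ld_library_path):
    """Return the entries of an LD_LIBRARY_PATH-style string that contain
    files matching libcudnn*.so* (assumed naming of a local cuDNN install)."""
    hits = []
    for entry in ld_library_path.split(os.pathsep):
        if entry and glob.glob(os.path.join(entry, "libcudnn*.so*")):
            hits.append(entry)
    return hits


def strip_dirs(ld_library_path, dirs_to_drop):
    """Return the path string with the given entries removed, order preserved."""
    drop = set(dirs_to_drop)
    kept = [e for e in ld_library_path.split(os.pathsep) if e and e not in drop]
    return os.pathsep.join(kept)


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH", "")
    conflicting = find_cudnn_dirs(current)
    print("Entries containing a local cuDNN:", conflicting)
    print("LD_LIBRARY_PATH without them:", strip_dirs(current, conflicting))
```

The script only prints a suggestion. The dynamic loader reads LD_LIBRARY_PATH when the process starts, so the cleaned value has to be exported in the shell before launching training; changing it from inside the running Python process is not expected to help. Dropping a whole directory also hides any other libraries in it, so check that nothing else depends on that entry. Afterwards, torch.backends.cudnn.version() should report the cuDNN version that PyTorch actually loaded.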


Cool, thanks, that seems to have worked. I appreciate the fast response.