I have a model that uses torchaudio.transforms.MelSpectrogram and torchaudio.models.Conformer. It works with torch==2.0.0, torchaudio==2.0.1, and torchdata==0.6.0, but it fails with the latest packages: torch==2.1.0, torchaudio==2.1.0, and torchdata==0.7.0. This is a problem for me because I need features from the newer releases.
Training fails with this error:
Could not load library libcudnn_cnn_train.so.8. Error: /usr/local/cuda/lib64/libcudnn_cnn_train.so.8: undefined symbol: _ZN5cudnn3cnn5infer22queryClusterPropertiesERPhS3_, version libcudnn_cnn_infer.so.8
Traceback (most recent call last):
...
File "/venvs/bipu2/lib/python3.10/site-packages/torch/_tensor.py", line 492, in backward
torch.autograd.backward(
File "/venvs/bipu2/lib/python3.10/site-packages/torch/autograd/__init__.py", line 251, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: GET was unable to find an engine to execute this computation
The error is raised when loss.backward() runs, so it only happens during training.
What’s going on?
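In case it helps narrow this down, here is a minimal sketch of how the environment could be inspected. The error message names /usr/local/cuda/lib64/libcudnn_cnn_train.so.8, so the script lists libcudnn files in that directory and in LD_LIBRARY_PATH, then prints the versions that torch reports. The helper names find_cudnn_libs and library_path_dirs are made up for this example, and the idea that a system-wide cuDNN conflicts with the one torch expects is only an assumption; the traceback does not confirm it.

```python
import glob
import os


def find_cudnn_libs(search_dirs):
    """Return sorted paths of libcudnn shared objects found directly in search_dirs."""
    found = []
    for d in search_dirs:
        # Skip empty entries and directories that do not exist on this machine.
        if d and os.path.isdir(d):
            found.extend(glob.glob(os.path.join(d, "libcudnn*.so*")))
    return sorted(set(found))


def library_path_dirs(env=None):
    """Split LD_LIBRARY_PATH into directories (empty list if it is unset)."""
    env = os.environ if env is None else env
    return [p for p in env.get("LD_LIBRARY_PATH", "").split(os.pathsep) if p]


if __name__ == "__main__":
    # /usr/local/cuda/lib64 is the directory named in the error message above.
    dirs = library_path_dirs() + ["/usr/local/cuda/lib64"]
    for path in find_cudnn_libs(dirs):
        print("system cuDNN candidate:", path)

    try:
        import torch
    except ImportError:
        print("torch is not importable in this environment")
    else:
        print("torch:", torch.__version__)
        print("built for CUDA:", torch.version.cuda)
        # Returns None when cuDNN is not available to torch.
        print("cuDNN seen by torch:", torch.backends.cudnn.version())
```

If the listed files belong to a different cuDNN release than the one torch prints, that mismatch would be the first thing I would look at, though I have not verified that it is the cause here.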