CppExtension loading complains about an undefined symbol

Hi,

I made a CppExtension, and it built without problems, but when I import it from Python I get an undefined symbol error:

Python 3.7.1 (default, Oct 23 2018, 17:15:52) 
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch_asr._latgen_lib
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: /home/jinserk/.pyenv/versions/3.7.1/lib/python3.7/site-packages/torch_asr/_latgen_lib.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN2at19UndefinedTensorImpl10_singletonE
>>> 

I’m using gcc 8.2.0 on Ubuntu 18.10. Interestingly, I had no problem on a CentOS 7 system with gcc 4.8.5, using the exact same extension and setup.py.

I wonder whether the ATen shared libraries are loaded automatically by PyTorch. Do I need to add the PyTorch directory to LD_LIBRARY_PATH manually? I certainly didn’t do that on CentOS 7. If I have to, why does Ubuntu require it while CentOS 7 doesn’t?

Hi,

That looks like an ABI incompatibility between the compiler used to build PyTorch and the one used to build the extension.
I’m sure @goldsborough can tell you why!

Thanks for the reply, @albanD!
Oh, I’m using torch_nightly on Ubuntu, but a custom-built PyTorch on CentOS. How can I check the ABI compatibility between the PyTorch binaries and the extension I built? If they are not compatible, how can I fix that with the official torch_nightly build?

ABI compatibility should be checked automatically, and a warning raised if there is a risk of incompatibility. I guess this check is faulty? @goldsborough will let you know for sure.
In the meantime, a temporary solution would be to try a different gcc version. I am not sure which one is best: the warning says 4.9+, but 8.2 might be too new?

You need to import torch before you import your library.
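
A minimal sketch of that import order, wrapped in a small helper. The helper name import_after is hypothetical and not part of PyTorch. The likely reason the order matters is that importing torch loads PyTorch's own shared libraries into the process, so the dynamic linker can then resolve the ATen symbols the extension refers to.

```python
import importlib


def import_after(module_name, prerequisites=("torch",)):
    """Import each prerequisite package first, then import and return module_name."""
    for prerequisite in prerequisites:
        # Importing the prerequisite first loads whatever shared libraries it brings in.
        importlib.import_module(prerequisite)
    return importlib.import_module(module_name)


# Intended use for the extension in this thread (requires torch to be installed):
#   latgen_lib = import_after("torch_asr._latgen_lib")
```

In an interactive session the same effect comes from typing import torch on the line before import torch_asr._latgen_lib. Placing import torch at the top of the package's own __init__.py is another option, so that users cannot get the order wrong.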

Hi @goldsborough,

I did that, but I still have a problem:

$ python
Python 3.7.1 (default, Oct 23 2018, 17:15:52) 
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import torch.utils.cpp_extension
>>> torch.utils.cpp_extension.check_compiler_abi_compatibility("g++")
True
>>> torch.utils.cpp_extension.check_compiler_abi_compatibility("gcc")
True
>>> import torch_asr._latgen_lib
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: /home/jinserk/.pyenv/versions/3.7.1/lib/python3.7/site-packages/torch_asr/_latgen_lib.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN3fst12ReadFstKaldiESs
>>> 

I’ve also tried gcc 7.3.0, but I get the same undefined symbol error.

Hi,

This missing symbol is not from PyTorch: _ZN3fst12ReadFstKaldiESs demangles to fst::ReadFstKaldi(std::string). Do you use another shared library that you don’t link or load before loading your binary?
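
The trailing Ss in that symbol carries a further hint. In the Itanium C++ name mangling used by gcc, Ss abbreviates std::string, and libstdc++ emits it only for the pre-C++11 string type; with the newer ABI the name contains std::__cxx11 instead. So the extension may be asking for an old-ABI function while a Kaldi library built by gcc 8 with default settings would export the new-ABI variant. The sketch below is a rough heuristic for reading that hint from a mangled name, not a real demangler; the function names are hypothetical.

```python
def strip_identifiers(mangled):
    """Drop length-prefixed names (e.g. '3fst') so only mangling codes remain."""
    kept = []
    i = 0
    while i < len(mangled):
        if mangled[i].isdigit():
            j = i
            while j < len(mangled) and mangled[j].isdigit():
                j += 1
            # Skip the length prefix and the identifier it announces.
            i = j + int(mangled[i:j])
        else:
            kept.append(mangled[i])
            i += 1
    return "".join(kept)


def guess_string_abi(mangled):
    """Guess which libstdc++ std::string ABI a mangled symbol refers to."""
    if "7__cxx11" in mangled:  # std::__cxx11::basic_string, the new ABI
        return "cxx11"
    if "Ss" in strip_identifiers(mangled):  # abbreviated std::string, the old ABI
        return "pre-cxx11"
    return "unknown"  # no std::string in the signature, so no hint either way


print(guess_string_abi("_ZN3fst12ReadFstKaldiESs"))
print(guess_string_abi("_ZN2at19UndefinedTensorImpl10_singletonE"))
```

If the guess is pre-cxx11, comparing it with what torch.compiled_with_cxx11_abi() reports and with how the third-party library was compiled (the _GLIBCXX_USE_CXX11_ABI macro) is a reasonable next step.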

Yes. I use Kaldi, an ASR framework, and link against the shared libraries from that project. When I add its lib directory to LD_LIBRARY_PATH, the error about Kaldi goes away, but the import still complains about the ATen library.

Here is my project setup.py:

Hi, have you figured out the reason?

Have you solved this?

I might have a similar issue, in my case I am using

#include <ATen/native/ConvUtils.h> // for cudnn_conv_suggest_memory_format
[...]
auto dataType = at::native::getCudnnDataType(input);
[...]

The extension builds fine (either with setup.py or with load/JIT); however, when importing it I get an error:

ImportError: /home/eduardoj/miniconda3/lib/python3.9/site-packages/jordan_g-0.0.0-py3.9-linux-x86_64.egg/cudnn_convolution.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZN2at6native16getCudnnDataTypeERKNS_6TensorE

Do I need to add any path to $LD_LIBRARY_PATH ?
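
Before changing LD_LIBRARY_PATH, it may help to check whether a given shared library exports the missing symbol at all: if none of the libraries shipped with PyTorch does, no search-path setting can fix the import. The sketch below uses only ctypes from the standard library and assumes a Linux-like system; the helper name exports_symbol and the path in the comment are hypothetical.

```python
import ctypes


def exports_symbol(library_path, symbol):
    """Return True if the symbol can be resolved through the given library.

    library_path=None means the current process and everything already loaded
    into it. The lookup also searches the library's own dependencies.
    """
    library = ctypes.CDLL(library_path)  # raises OSError if the library cannot be loaded
    try:
        library[symbol]  # performs the dynamic symbol lookup
    except AttributeError:
        return False
    return True


# Hypothetical use for the symbol in the post above; substitute a real path
# from the lib directory of your torch installation:
#   exports_symbol("/path/to/torch/lib/some_library.so",
#                  "_ZN2at6native16getCudnnDataTypeERKNS_6TensorE")
```

A False result for every PyTorch library would suggest the function is internal to PyTorch and not exported, rather than a library search-path problem.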