Undefined symbol when importing lltm cpp extension

I cloned the extension-cpp repo and installed the cpp extension for lltm with its setup script, but when I run the following statements:

import torch
import lltm

this error is thrown:

lltm_cpp.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC1ENS_14SourceLocationERKSs

Btw, I compiled both PyTorch and the cpp extension with gcc 5.4.

System info:
CUDA 9.0
CUDNN 7.1.4
Python 3.6.4
PyTorch 1.0
miniconda 3
GCC 5.4.0

Finally, I solved the problem:
It's caused by _GLIBCXX_USE_CXX11_ABI=1 being set when compiling PyTorch from source. That means the C++ std::string ABI doesn't match between the PyTorch build and the cpp extension build.

There are two ways to solve this problem:

  1. build cpp extensions with -D_GLIBCXX_USE_CXX11_ABI=1.
  2. build pytorch with -D_GLIBCXX_USE_CXX11_ABI=0.
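Both options come down to making one preprocessor define agree on the two sides of the build. Below is a minimal sketch of option 1, assuming the extension is built through setuptools and the flag travels in the extension's extra_compile_args. The helper names abi_define and with_abi_flag are hypothetical, not part of PyTorch.

```python
def abi_define(use_cxx11_abi):
    """Return the -D flag that selects the libstdc++ std::string ABI."""
    return "-D_GLIBCXX_USE_CXX11_ABI={}".format(int(use_cxx11_abi))


def with_abi_flag(compile_args, use_cxx11_abi):
    """Return a copy of compile_args holding exactly one ABI define, placed last."""
    prefix = "-D_GLIBCXX_USE_CXX11_ABI="
    # Drop any stale ABI define so two conflicting values never reach the compiler.
    kept = [arg for arg in compile_args if not arg.startswith(prefix)]
    return kept + [abi_define(use_cxx11_abi)]


# Example: flags for an extension that must match a PyTorch built with ABI=1.
print(with_abi_flag(["-O2", "-D_GLIBCXX_USE_CXX11_ABI=0"], True))
```

The returned list could then be passed as extra_compile_args when declaring the extension in setup.py.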

Here is how I figured it out:

  1. First I checked the .so files of the newly installed PyTorch v1.0.0 in ~/anaconda3/lib/python3.6/site-packages/torch, which is my PyTorch path:
find  ~/anaconda3/lib/python3.6/site-packages/torch.bak/ -name "*.so" -exec bash -c "nm -D {} | grep SourceLocation" \;

The outputs:

                 U _ZN3c105ErrorC1ENS_14SourceLocationERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE                                                                                                       
                 U _ZN3c105ErrorC1ENS_14SourceLocationERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE                                                                                                       
                 U _ZN3c107Warning4warnENS_14SourceLocationENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE                                                                                                    
000000000071c4e0 T _ZN5torch3jit6tracer20recordSourceLocationEPNS0_4NodeE
000000000071c4f0 T _ZN5torch3jit6tracer23setRecordSourceLocationEPFvPNS0_4NodeEE
000000000071be60 T _ZN5torch3jit6tracer27defaultRecordSourceLocationEPNS0_4NodeE
000000000060ce30 W _ZNSt14_Function_base13_Base_managerIZN5torch3jit8CodeImpl12insertAssignESt10shared_ptrINS2_14SourceLocationEEN3c108ArrayRefIPNS2_5ValueEEENS8_IhEESB_EUlRSt6vectorINS7_6IValueESaISE_EEE_E10_M_managerERSt9_Any_dataRKSK_St18_Manager_operation
000000000060ce20 W _ZNSt17_Function_handlerIFiRSt6vectorIN3c106IValueESaIS2_EEEZN5torch3jit8CodeImpl12insertAssignESt10shared_ptrINS8_14SourceLocationEENS1_8ArrayRefIPNS8_5ValueEEENSD_IhEESG_EUlS5_E_E9_M_invokeERKSt9_Any_dataS5_
0000000000d9aa30 V _ZTIN5torch3jit14SourceLocationE
0000000000da23a0 V _ZTIZN5torch3jit8CodeImpl12insertAssignESt10shared_ptrINS0_14SourceLocationEEN3c108ArrayRefIPNS0_5ValueEEENS6_IhEES9_EUlRSt6vectorINS5_6IValueESaISC_EEE_                                        
000000000097e5a0 V _ZTSN5torch3jit14SourceLocationE
00000000009ccaa0 V _ZTSZN5torch3jit8CodeImpl12insertAssignESt10shared_ptrINS0_14SourceLocationEEN3c108ArrayRefIPNS0_5ValueEEENS6_IhEES9_EUlRSt6vectorINS5_6IValueESaISC_EEE_                                        
0000000000da1b58 V _ZTVN5torch3jit14SourceLocationE
                 U _ZN3c105ErrorC1ENS_14SourceLocationERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE                                                                                                       
                 U _ZN3c107Warning4warnENS_14SourceLocationENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE                                                                                                    
                 U _ZN3c105ErrorC1ENS_14SourceLocationERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE                                                                                                       
                 U _ZN3c107Warning4warnENS_14SourceLocationENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE                                                                                                    
                 U _ZN3c105ErrorC1ENS_14SourceLocationERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE                                                                                                       
                 U _ZN3c107Warning19set_warning_handlerEPFvRKNS_14SourceLocationEPKcE
                 U _ZN3c107Warning4warnENS_14SourceLocationENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
                 U _ZN5torch3jit6tracer20recordSourceLocationEPNS0_4NodeE
                 U _ZN5torch3jit6tracer23setRecordSourceLocationEPFvPNS0_4NodeEE
000000000036f230 T _ZN5torch3jit6tracer26pythonRecordSourceLocationEPNS0_4NodeE
0000000000374000 W _ZNSt23_Sp_counted_ptr_inplaceIN5torch3jit20StringSourceLocationESaIS2_ELN9__gnu_cxx12_Lock_policyE2EE10_M_destroyEv
0000000000373f30 W _ZNSt23_Sp_counted_ptr_inplaceIN5torch3jit20StringSourceLocationESaIS2_ELN9__gnu_cxx12_Lock_policyE2EE10_M_disposeEv
0000000000374070 W _ZNSt23_Sp_counted_ptr_inplaceIN5torch3jit20StringSourceLocationESaIS2_ELN9__gnu_cxx12_Lock_policyE2EE14_M_get_deleterERKSt9type_info
0000000000373ff0 W _ZNSt23_Sp_counted_ptr_inplaceIN5torch3jit20StringSourceLocationESaIS2_ELN9__gnu_cxx12_Lock_policyE2EED0Ev
0000000000373f20 W _ZNSt23_Sp_counted_ptr_inplaceIN5torch3jit20StringSourceLocationESaIS2_ELN9__gnu_cxx12_Lock_policyE2EED1Ev
0000000000373f20 W _ZNSt23_Sp_counted_ptr_inplaceIN5torch3jit20StringSourceLocationESaIS2_ELN9__gnu_cxx12_Lock_policyE2EED2Ev
00000000009ec4b8 V _ZTIN5torch3jit14SourceLocationE
00000000009ef470 V _ZTIN5torch3jit20StringSourceLocationE
00000000009ef4d8 V _ZTISt23_Sp_counted_ptr_inplaceIN5torch3jit20StringSourceLocationESaIS2_ELN9__gnu_cxx12_Lock_policyE2EE
0000000000686700 V _ZTSN5torch3jit14SourceLocationE
00000000006a3a00 V _ZTSN5torch3jit20StringSourceLocationE
00000000006a3be0 V _ZTSSt23_Sp_counted_ptr_inplaceIN5torch3jit20StringSourceLocationESaIS2_ELN9__gnu_cxx12_Lock_policyE2EE
00000000009efd40 V _ZTVN5torch3jit14SourceLocationE
00000000009ef508 V _ZTVN5torch3jit20StringSourceLocationE
00000000009ef568 V _ZTVSt23_Sp_counted_ptr_inplaceIN5torch3jit20StringSourceLocationESaIS2_ELN9__gnu_cxx12_Lock_policyE2EE
0000000000011f40 T _ZN3c105ErrorC1ENS_14SourceLocationERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
0000000000011f40 T _ZN3c105ErrorC2ENS_14SourceLocationERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
000000000000ff00 T _ZN3c107Warning13print_warningERKNS_14SourceLocationEPKc
0000000000011060 T _ZN3c107Warning19set_warning_handlerEPFvRKNS_14SourceLocationEPKcE
0000000000011040 T _ZN3c107Warning4warnENS_14SourceLocationENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE

See those __cxx11 symbols: they come from the CXX11 ABI.

  2. To make sure that it is indeed a CXX11 ABI problem, I checked an earlier version, v0.4.1, and ran the same command:

                 U _ZN3c105ErrorC1ENS_14SourceLocationERKSs
                 U _ZN3c105ErrorC1ENS_14SourceLocationERKSs
                 U _ZN3c107Warning4warnENS_14SourceLocationESs
000000000077f460 T _ZN5torch3jit6tracer20recordSourceLocationEPNS0_4NodeE
000000000077f470 T _ZN5torch3jit6tracer23setRecordSourceLocationEPFvPNS0_4NodeEE
000000000077f1b0 T _ZN5torch3jit6tracer27defaultRecordSourceLocationEPNS0_4NodeE
00000000006822a0 W _ZNSt14_Function_base13_Base_managerIZN5torch3jit8CodeImpl12insertAssignESt10shared_ptrINS2_14SourceLocationEEN3c108ArrayRefIPNS2_5ValueEEENS8_IhEESB_EUlRSt6vectorINS7_6IValueESaISE_EEE_E10_M_managerERSt9_Any_dataRKSK_St18_Manager_operation
0000000000682080 W _ZNSt17_Function_handlerIFiRSt6vectorIN3c106IValueESaIS2_EEEZN5torch3jit8CodeImpl12insertAssignESt10shared_ptrINS8_14SourceLocationEENS1_8ArrayRefIPNS8_5ValueEEENSD_IhEESG_EUlS5_E_E9_M_invokeERKSt9_Any_dataS5_
0000000000de27d0 V _ZTIN5torch3jit14SourceLocationE
0000000000dea930 V _ZTIZN5torch3jit8CodeImpl12insertAssignESt10shared_ptrINS0_14SourceLocationEEN3c108ArrayRefIPNS0_5ValueEEENS6_IhEES9_EUlRSt6vectorINS5_6IValueESaISC_EEE_                                        
000000000099a840 V _ZTSN5torch3jit14SourceLocationE
00000000009f6bc0 V _ZTSZN5torch3jit8CodeImpl12insertAssignESt10shared_ptrINS0_14SourceLocationEEN3c108ArrayRefIPNS0_5ValueEEENS6_IhEES9_EUlRSt6vectorINS5_6IValueESaISC_EEE_                                        
0000000000de2a80 V _ZTVN5torch3jit14SourceLocationE
                 U _ZN3c105ErrorC1ENS_14SourceLocationERKSs
                 U _ZN3c107Warning4warnENS_14SourceLocationESs
                 U _ZN3c105ErrorC1ENS_14SourceLocationERKSs
                 U _ZN3c107Warning4warnENS_14SourceLocationESs
                 U _ZN3c105ErrorC1ENS_14SourceLocationERKSs
                 U _ZN3c107Warning19set_warning_handlerEPFvRKNS_14SourceLocationEPKcE
                 U _ZN3c107Warning4warnENS_14SourceLocationESs
                 U _ZN5torch3jit6tracer20recordSourceLocationEPNS0_4NodeE
                 U _ZN5torch3jit6tracer23setRecordSourceLocationEPFvPNS0_4NodeEE
000000000039c3d0 T _ZN5torch3jit6tracer26pythonRecordSourceLocationEPNS0_4NodeE
00000000003a0210 W _ZNSt23_Sp_counted_ptr_inplaceIN5torch3jit20StringSourceLocationESaIS2_ELN9__gnu_cxx12_Lock_policyE2EE10_M_destroyEv                                                                             
00000000003a01d0 W _ZNSt23_Sp_counted_ptr_inplaceIN5torch3jit20StringSourceLocationESaIS2_ELN9__gnu_cxx12_Lock_policyE2EE10_M_disposeEv                                                                             
00000000003a03d0 W _ZNSt23_Sp_counted_ptr_inplaceIN5torch3jit20StringSourceLocationESaIS2_ELN9__gnu_cxx12_Lock_policyE2EE14_M_get_deleterERKSt9type_info                                                            
00000000003a0220 W _ZNSt23_Sp_counted_ptr_inplaceIN5torch3jit20StringSourceLocationESaIS2_ELN9__gnu_cxx12_Lock_policyE2EED0Ev                                                                                       
00000000003a01c0 W _ZNSt23_Sp_counted_ptr_inplaceIN5torch3jit20StringSourceLocationESaIS2_ELN9__gnu_cxx12_Lock_policyE2EED1Ev                                                                                       
00000000003a01c0 W _ZNSt23_Sp_counted_ptr_inplaceIN5torch3jit20StringSourceLocationESaIS2_ELN9__gnu_cxx12_Lock_policyE2EED2Ev                                                                                       
0000000000a0bdb0 V _ZTIN5torch3jit14SourceLocationE
0000000000a0f750 V _ZTIN5torch3jit20StringSourceLocationE
0000000000a0f7d0 V _ZTISt23_Sp_counted_ptr_inplaceIN5torch3jit20StringSourceLocationESaIS2_ELN9__gnu_cxx12_Lock_policyE2EE                                                                                          
00000000006a0bc0 V _ZTSN5torch3jit14SourceLocationE
00000000006be5c0 V _ZTSN5torch3jit20StringSourceLocationE
00000000006be7c0 V _ZTSSt23_Sp_counted_ptr_inplaceIN5torch3jit20StringSourceLocationESaIS2_ELN9__gnu_cxx12_Lock_policyE2EE                                                                                          
0000000000a0bf00 V _ZTVN5torch3jit14SourceLocationE
0000000000a0f820 V _ZTVN5torch3jit20StringSourceLocationE
0000000000a0f8a0 V _ZTVSt23_Sp_counted_ptr_inplaceIN5torch3jit20StringSourceLocationESaIS2_ELN9__gnu_cxx12_Lock_policyE2EE                                                                                          
0000000000012dd0 T _ZN3c105ErrorC1ENS_14SourceLocationERKSs
0000000000012dd0 T _ZN3c105ErrorC2ENS_14SourceLocationERKSs
00000000000108c0 T _ZN3c107Warning13print_warningERKNS_14SourceLocationEPKc
0000000000010cb0 T _ZN3c107Warning19set_warning_handlerEPFvRKNS_14SourceLocationEPKcE
0000000000010c90 T _ZN3c107Warning4warnENS_14SourceLocationESs

See, it does contain the symbol _ZN3c105ErrorC1ENS_14SourceLocationERKSs.
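The difference between the two nm listings can be sketched as a small classifier. This is a rough heuristic, not a real demangler: it assumes that a __cxx11 namespace in the mangled name marks the new std::string ABI and that the Ss abbreviation marks the old one. A plain substring test for Ss could misfire on unrelated identifiers. The function name string_abi is hypothetical.

```python
def string_abi(mangled):
    """Guess which std::string ABI a mangled symbol was compiled against."""
    if "__cxx11" in mangled:
        return "cxx11"        # std::__cxx11::basic_string, i.e. ABI=1
    if "Ss" in mangled:
        return "pre-cxx11"    # old std::string abbreviation, i.e. ABI=0
    return "unknown"          # the symbol does not mention std::string


# Symbols copied from the two nm listings above.
for symbol in (
    "_ZN3c105ErrorC1ENS_14SourceLocationERKSs",
    "_ZN3c105ErrorC1ENS_14SourceLocationERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE",
    "_ZN3c107Warning19set_warning_handlerEPFvRKNS_14SourceLocationEPKcE",
):
    print(string_abi(symbol), symbol)
```

If the symbol the extension asks for and the symbol the library exports fall into different classes, the two sides were built with different ABI settings.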

Then I rebuilt pytorch from source with export CFLAGS="-D_GLIBCXX_USE_CXX11_ABI=0 $CFLAGS".


But even after setting export CFLAGS="-D_GLIBCXX_USE_CXX11_ABI=0 $CFLAGS", I still get compile errors.

I'm wondering how to compile PyTorch 1.0 with _GLIBCXX_USE_CXX11_ABI=0?

PyTorch is already built with _GLIBCXX_USE_CXX11_ABI=0. So you have to also set this flag when compiling the C++ extension. In PyTorch 1.0 this should already be done for you, see https://github.com/pytorch/pytorch/blob/master/torch/utils/cpp_extension.py#L390.

The best way to solve this problem in any case is to compile PyTorch from source and use that same compiler for the extension. Then all problems go away.


You can check whether your PyTorch was built with the CXX11 ABI set to 1 or 0 via torch._C._GLIBCXX_USE_CXX11_ABI.
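A minimal sketch of that check, wrapped so that it also runs on a machine without PyTorch. The function name torch_cxx11_abi is hypothetical. The attribute it reads is the one named in the post above.

```python
import importlib.util


def torch_cxx11_abi():
    """Return the ABI flag PyTorch was built with, or None if it cannot be read."""
    if importlib.util.find_spec("torch") is None:
        return None  # PyTorch is not installed in this environment
    import torch
    return getattr(torch._C, "_GLIBCXX_USE_CXX11_ABI", None)


flag = torch_cxx11_abi()
if flag is None:
    print("could not read the ABI flag (is PyTorch installed?)")
else:
    print("PyTorch built with _GLIBCXX_USE_CXX11_ABI={}".format(int(flag)))
```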


Being able to check this is really helpful!

And it's not easy for everyone (including me) to compile PyTorch from source.
So I ran this
conda install pytorch torchvision cudatoolkit=10.0 -c pytorch
to download from the pytorch channel, instead of running this:
conda install pytorch torchvision cudatoolkit=10.0

With the former, torch._C._GLIBCXX_USE_CXX11_ABI is False; with the latter, it is True.


I’m facing the same problem when running this sample: https://github.com/NVIDIA/retinanet-examples

I built PyTorch from source with _GLIBCXX_USE_CXX11_ABI set to False. Then I built the extension, but when running it, this error occurs:

Traceback (most recent call last):
  File "/ssd2/exec/xiaoyunlong/code/retinanet-examples/retinanet/main.py", line 10, in <module>
    from retinanet import infer, train, utils
  File "/ssd2/exec/xiaoyunlong/anaconda3/lib/python3.7/site-packages/retinanet/infer.py", line 13, in <module>
    from .model import Model
  File "/ssd2/exec/xiaoyunlong/anaconda3/lib/python3.7/site-packages/retinanet/model.py", line 8, in <module>
    from ._C import Engine
ImportError: /ssd2/exec/xiaoyunlong/anaconda3/lib/python3.7/site-packages/retinanet/_C.so: undefined symbol: _ZN2cv8fastFreeEPv

I have checked _GLIBCXX_USE_CXX11_ABI via torch._C._GLIBCXX_USE_CXX11_ABI and it outputs False.

Am I missing something, or did I do something wrong?

@IceSuger _ZN2cv8fastFreeEPv (a.k.a. cv::fastFree(void*)) is a symbol from opencv. Did you check that opencv is correctly linked? And is opencv compiled using the same compiler as pytorch?

I finally solved this ABI incompatibility problem by using the same gcc (gcc 4.9 or higher) when building PyTorch and the extensions.

Thank you for your quick reply!
After spending a whole day trying, and failing, to compile and install opencv, I used conda install for pytorch and opencv, and pip install for the retinanet package (https://github.com/NVIDIA/retinanet-examples, which contains a PyTorch C++ extension).
But the result is the same: undefined symbol: _ZN2cv8fastFreeEPv.

So I guess PyTorch, opencv and the extension must all be compiled from source with the same gcc version?

Also, I am wondering how to check whether my opencv is correctly linked? (When I import cv2 in the python installed with anaconda3, no error occurs.)

Thank you again !
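One possible way to approach the linking question is to look at which shared libraries the extension's .so depends on, as printed by ldd on Linux. The sketch below only parses such output. The sample text is made up for illustration, and the function names parse_ldd and links_against are hypothetical. If no opencv library shows up among the dependencies, the extension was probably not linked against opencv dynamically, which would leave cv:: symbols undefined at import time.

```python
def parse_ldd(ldd_output):
    """Split ldd output into (resolved, missing) lists of library names."""
    resolved, missing = [], []
    for line in ldd_output.splitlines():
        if "=>" not in line:
            continue  # e.g. the vdso or dynamic loader line, which has no arrow
        name, target = (part.strip() for part in line.split("=>", 1))
        if target.startswith("not found"):
            missing.append(name)
        else:
            resolved.append(name)
    return resolved, missing


def links_against(ldd_output, needle):
    """Return True if any listed dependency name contains needle."""
    resolved, missing = parse_ldd(ldd_output)
    return any(needle in name for name in resolved + missing)


# Made-up sample in the usual ldd layout; real output comes from running ldd on the .so.
SAMPLE = (
    "\tlinux-vdso.so.1 (0x00007ffd0a5f2000)\n"
    "\tlibc10.so => not found\n"
    "\tlibstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f1e2c000000)\n"
)
print(parse_ldd(SAMPLE))
print(links_against(SAMPLE, "opencv"))
```

A successful import cv2 in Python only shows that the cv2 module can find its own libraries. It says nothing about what a separate extension was linked against.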

This may be late, but I hope it helps others.
If you use pytorch-1.0.x, _GLIBCXX_USE_CXX11_ABI is automatically set to 0, please check here. Even if you export another _GLIBCXX_USE_CXX11_ABI value in the shell or add an extra compile argument in setup.py, it will be overridden. In pytorch-1.1, by contrast, _GLIBCXX_USE_CXX11_ABI is set to the same value as torch._C._GLIBCXX_USE_CXX11_ABI, please check here.
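The overriding can be modelled with a small helper. This is a sketch under two assumptions: that the build helper appends its own define after the user's flags, and that the compiler lets the last definition win, which is how GCC treats repeated -D options. It models the behaviour described above and is not PyTorch's actual code. The name effective_cxx11_abi is hypothetical.

```python
def effective_cxx11_abi(flags):
    """Return 0 or 1 from the last ABI define in flags, or None if there is none."""
    prefix = "-D_GLIBCXX_USE_CXX11_ABI="
    value = None
    for flag in flags:
        if flag.startswith(prefix):
            value = int(flag[len(prefix):])  # a later define replaces an earlier one
    return value


# User asks for ABI=1, then a build helper appends ABI=0 (the 1.0.x behaviour described above).
user_flags = ["-O2", "-D_GLIBCXX_USE_CXX11_ABI=1"]
print(effective_cxx11_abi(user_flags + ["-D_GLIBCXX_USE_CXX11_ABI=0"]))
```

Running the same helper over the compiler command printed during the build would show which value actually took effect.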