Error when compiling a torch.utils.cpp_extension with cuBLAS

Hi all,

I recently encountered a problem when building a custom PyTorch operator with torch.utils.cpp_extension. In my kernel, I call cuBLAS (cublas_v2) for GEMM operations. The extension compiles without errors, but when I import the operator in Python, it crashes with the following message.

test_cuda.cpython-37m-x86_64-linux-gnu.so: undefined symbol: __cudaRegisterLinkedBinary_50_tmpxft_0000403a_00000000_7_test_cuda_kernel_cpp1_ii_7487cc74

However, when I remove all cuBLAS-related code from my kernel and recompile, it runs without any error.

Here are the options I use in my setup file.

from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

setup(
    name='test',
    ext_modules=[
        CUDAExtension(
            name='test_cuda',
            sources=['test_cuda.cpp', 'test_cuda_kernel.cu'],
            extra_compile_args={'cxx': ['-O3'],
                                'nvcc': ['-dc', '-lcublas', '-arch=sm_86',
                                         '-lcuda', '-lcudart', '-lcudadevrt']})
    ],
    cmdclass={
        'build_ext': BuildExtension
    })

Thanks!


I'm not certain, but moving -lcublas to the last position in the list may help.
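Another thing worth checking: the undefined __cudaRegisterLinkedBinary_* symbol typically shows up when a .cu file is compiled with -dc (relocatable device code) but no device-link step is performed afterwards. Also, -l flags passed in extra_compile_args are compile-stage options and never reach the linker. As a sketch (assuming your kernel only calls the cuBLAS host API, so -dc is not actually needed), you could drop -dc and let setuptools pass cuBLAS at link time via the libraries argument:

```python
# setup.py sketch -- assumes test_cuda_kernel.cu only uses the cuBLAS
# host API, so relocatable device code (-dc) is not required.
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

setup(
    name='test',
    ext_modules=[
        CUDAExtension(
            name='test_cuda',
            sources=['test_cuda.cpp', 'test_cuda_kernel.cu'],
            # 'libraries' adds -lcublas to the final link command,
            # which is where it belongs (not in extra_compile_args).
            libraries=['cublas'],
            extra_compile_args={'cxx': ['-O3'],
                                'nvcc': ['-O3', '-arch=sm_86']})
    ],
    cmdclass={'build_ext': BuildExtension})
```

If you genuinely need -dc (e.g. for device-side linking across translation units), the extension would additionally need a device-link pass; but for plain host-side cuBLAS calls, removing -dc is usually enough.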