Error using CUDNN in custom cuda file

I am writing my own CUDA implementation of a CNN and, to do so, I’m compiling it as a C++ extension using Ninja. All the basic CUDA code (e.g., malloc, memcpy, kernel launches) works fine, but I am having an issue using CUDNN in my custom extension.

I’m running PyTorch 1.10 in a Conda environment. PyTorch is using CUDNN (torch.backends.cudnn.enabled == True, and torch.backends.cudnn.version() returns 8200).
I also have CUDNN installed under /usr/local/, since I think this is the CUDA installation that Ninja uses?

When I try to call CUDNN functions, the extension builds fine, but when I call it from the Python script, I get an error:
undefined symbol: cudnnCreateTensorDescriptor

<cudnn.h> is included in my .cu file. Also, when I compile this code with nvcc directly (passing the -lcudnn flag), it compiles and runs fine. But if I do the same in my setup.py file when I build the extension, I still get the same error.

Specifically, I added the extra_compile_args line to my setup.py file to build the extension:

CUDAExtension(
    name='<cuda code>',
    sources=['<cuda file>.cu'],
    extra_compile_args={'nvcc': ['-lcudnn']}
)

I also confirmed in my build log that -lcudnn is being passed to nvcc. But I still get the same error.

Is there something else I need to do to get nvcc to link the cudnn library in when it builds the extension?
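(For reference: -lcudnn is a link-time flag, and extra_compile_args only affects the compile step, where linker flags are ignored. One approach that might work instead is the standard setuptools libraries/library_dirs fields, which CUDAExtension inherits. A sketch, with hypothetical names, assuming libcudnn.so lives under /usr/local/cuda/lib64 — adjust for your install:)

```python
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

# Sketch: request cuDNN at the *link* step via setuptools' standard fields,
# instead of passing -lcudnn to the compile step, where it is ignored.
setup(
    name='my_cudnn_ext',  # hypothetical package name
    ext_modules=[
        CUDAExtension(
            name='my_cudnn_ext',
            sources=['my_cudnn_ext.cu'],  # hypothetical source file
            libraries=['cudnn'],          # becomes -lcudnn on the link line
            library_dirs=['/usr/local/cuda/lib64'],  # where libcudnn.so lives
        )
    ],
    cmdclass={'build_ext': BuildExtension},
)
```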

Could you try passing -lcudnn via extra_ldflags = ["-lcudnn"]?
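(For context: extra_ldflags is an argument of the JIT torch.utils.cpp_extension.load() path, not of CUDAExtension in setup.py. A sketch of the JIT route, with hypothetical file names:)

```python
from torch.utils.cpp_extension import load

# Sketch: the JIT builder accepts extra_ldflags directly and appends them
# to the link line; the source file name here is hypothetical.
my_ext = load(
    name='my_cudnn_ext',
    sources=['my_cudnn_ext.cu'],
    extra_ldflags=['-lcudnn'],
    verbose=True,  # print the generated compile/link commands
)
```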

No luck. I added extra_ldflags = ["-lcudnn"] as you suggested, but this time -lcudnn wasn’t even added as an argument to nvcc during the build.

When I did extra_compile_args={'nvcc': ['-lcudnn']} it did add -lcudnn as an argument but still had the error I showed above.

EDIT: I also just noticed this line at the top of the outputs when I build the extension:

UserWarning: Unknown Extension options: 'extra_ldflags'

That’s strange, as it’s used in this test. Could you check if you could run and build this test?

Sorry, I’ve never run any tests that come with PyTorch. Is there any documentation on how I can run this test to check?

You can git clone the PyTorch repository and launch the tests via python test/test_cpp_extensions_jit.py -v. If you are using Python >= 3.8, you can filter the tests via -k cudnn.

So I cloned the 1.10.0 branch of the repo, but do I need to build PyTorch again for this to work? I’ve tried that before and had a load of issues. Is there no way to run this test using the version of PyTorch I have installed in my conda env?

Just in case, I gave that a quick try, but it failed with an error on the line import expecttest. So I guess there are other packages that have to be installed for the tests to work with an installed version of PyTorch?

You can just pip install expecttest in your current environment and execute the test afterwards.
Either a source build or the installed binaries will work.
E.g. I just reran the test using the 1.10.0+cu113 pip wheels in the source folder:

python test_cpp_extensions_jit.py -v -k cudnn
Fail to import hypothesis in common_utils, tests are not derandomized
test_jit_cudnn_extension (__main__.TestCppExtensionJIT) ... Using /opt/.cache/torch_extensions/py38_cu113 as PyTorch extensions root...
Creating extension directory /opt/.cache/torch_extensions/py38_cu113/torch_test_cudnn_extension...
Detected CUDA files, patching ldflags
Emitting ninja build file /opt/.cache/torch_extensions/py38_cu113/torch_test_cudnn_extension/build.ninja...
Building extension module torch_test_cudnn_extension...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] c++ -MMD -MF cudnn_extension.o.d -DTORCH_EXTENSION_NAME=torch_test_cudnn_extension -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /opt/miniforge3/envs/nightly_pip_cuda113/lib/python3.8/site-packages/torch/include -isystem /opt/miniforge3/envs/nightly_pip_cuda113/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/miniforge3/envs/nightly_pip_cuda113/lib/python3.8/site-packages/torch/include/TH -isystem /opt/miniforge3/envs/nightly_pip_cuda113/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/miniforge3/envs/nightly_pip_cuda113/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -c /opt/libs/upstream/pytorch/test/cpp_extensions/cudnn_extension.cpp -o cudnn_extension.o 
[2/2] c++ cudnn_extension.o -shared -lcudnn -L/opt/miniforge3/envs/nightly_pip_cuda113/lib/python3.8/site-packages/torch/lib -lc10 -lc10_cuda -ltorch_cpu -ltorch_cuda_cu -ltorch_cuda_cpp -ltorch -ltorch_python -L/usr/local/cuda/lib64 -lcudart -o torch_test_cudnn_extension.so
Loading extension module torch_test_cudnn_extension...
ok

----------------------------------------------------------------------
Ran 1 test in 10.621s

OK

Thank you! I was able to run the test as well, and it produced results identical to what you showed. However, it seems the test invokes c++, while in my case nvcc is invoked. And -DPYBIND11_COMPILER_TYPE=\"_gcc\" means it's using GCC then? How is GCC able to compile CUDA code?

Also, is this using the CUDNN version inside my conda env? When I build it the non-JIT way, it's using the CUDNN in my /usr/local directory instead.
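(One way to check which libcudnn a built extension actually resolved against is to inspect the resulting .so with ldd. A small helper sketch; the .so path you pass in is whatever your build produced:)

```python
import subprocess

def linked_cudnn(so_path):
    """Return the libcudnn line from `ldd so_path`, or None if cuDNN
    is not a dynamic dependency of that shared object."""
    out = subprocess.run(["ldd", so_path], capture_output=True, text=True)
    for line in out.stdout.splitlines():
        if "libcudnn" in line:
            return line.strip()
    return None

# e.g. linked_cudnn("torch_test_cudnn_extension.so")
# shows the full path of the libcudnn.so the loader picked.
```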