I am writing my own CUDA implementation of a CNN, and to do so I’m compiling it as a C++ extension using Ninja. All the base CUDA code (e.g., cudaMalloc, cudaMemcpy, kernel launches, etc.) works fine, but I am having an issue using cuDNN in my custom extension.
I’m running PyTorch 1.10 in a conda environment. PyTorch is built with cuDNN support (torch.backends.cudnn.enabled == True, and torch.backends.cudnn.version() returns 8200).
I also have cuDNN installed under /usr/local/, since I believe that is the CUDA installation Ninja uses?
When I try to call cuDNN functions, the extension builds fine, but when I call it from my Python script I get an error: undefined symbol: cudnnCreateTensorDescriptor
<cudnn.h> is included in my .cu file. Also, when I compile this code with nvcc directly (passing the -lcudnn flag), it compiles and runs fine. But if I pass the same flag in my setup.py file when I build the extension, it still gives me the same error.
Specifically, I added an extra_compile_args entry to my setup.py to pass -lcudnn when building the extension.
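For what it’s worth, -lcudnn is a linker flag, so in a setup.py build it would normally go into the extension’s libraries/library_dirs (or extra_link_args) rather than extra_compile_args, which only affects compilation. A minimal sketch, assuming a CUDAExtension build; the extension name, source file, and library path are illustrative, not taken from the original post:

```python
# Hypothetical setup.py sketch for linking cuDNN into a custom CUDA extension.
# The names "my_cnn_ext" / "my_cnn_kernels.cu" are placeholders.
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

setup(
    name="my_cnn_ext",
    ext_modules=[
        CUDAExtension(
            name="my_cnn_ext",
            sources=["my_cnn_kernels.cu"],
            # `libraries` emits -lcudnn at *link* time, which is what resolves
            # symbols such as cudnnCreateTensorDescriptor in the built .so.
            libraries=["cudnn"],
            # Adjust to wherever libcudnn.so lives on your system.
            library_dirs=["/usr/local/cuda/lib64"],
        )
    ],
    cmdclass={"build_ext": BuildExtension},
)
```

Putting -lcudnn in extra_compile_args compiles cleanly because the flag is simply ignored at the compile step, which would match the symptom of a successful build followed by an undefined-symbol error at import time.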
You can git clone the PyTorch repository and launch the test via python test/test_cpp_extensions_jit.py -v. If you are using Python>=3.8, you can filter the tests via -k cudnn.
So I cloned the 1.10.0 branch of the repo, but do I need to build PyTorch again for this to work? I’ve tried that before and ran into a load of issues. Is there no way to run this test against the PyTorch version installed in my conda env?
Just in case, I gave that a quick try, but it failed on the line import expecttest. So I guess other packages need to be installed for the tests to work with an already-installed PyTorch?
You can just pip install expecttest in your current environment and execute the test afterwards.
Either a source build or the prebuilt binaries would work.
E.g. I just reran the test using the 1.10.0+cu113 pip wheels in the source folder:
python test_cpp_extensions_jit.py -v -k cudnn
Fail to import hypothesis in common_utils, tests are not derandomized
test_jit_cudnn_extension (__main__.TestCppExtensionJIT) ... Using /opt/.cache/torch_extensions/py38_cu113 as PyTorch extensions root...
Creating extension directory /opt/.cache/torch_extensions/py38_cu113/torch_test_cudnn_extension...
Detected CUDA files, patching ldflags
Emitting ninja build file /opt/.cache/torch_extensions/py38_cu113/torch_test_cudnn_extension/build.ninja...
Building extension module torch_test_cudnn_extension...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] c++ -MMD -MF cudnn_extension.o.d -DTORCH_EXTENSION_NAME=torch_test_cudnn_extension -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /opt/miniforge3/envs/nightly_pip_cuda113/lib/python3.8/site-packages/torch/include -isystem /opt/miniforge3/envs/nightly_pip_cuda113/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/miniforge3/envs/nightly_pip_cuda113/lib/python3.8/site-packages/torch/include/TH -isystem /opt/miniforge3/envs/nightly_pip_cuda113/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/miniforge3/envs/nightly_pip_cuda113/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -c /opt/libs/upstream/pytorch/test/cpp_extensions/cudnn_extension.cpp -o cudnn_extension.o
[2/2] c++ cudnn_extension.o -shared -lcudnn -L/opt/miniforge3/envs/nightly_pip_cuda113/lib/python3.8/site-packages/torch/lib -lc10 -lc10_cuda -ltorch_cpu -ltorch_cuda_cu -ltorch_cuda_cpp -ltorch -ltorch_python -L/usr/local/cuda/lib64 -lcudart -o torch_test_cudnn_extension.so
Loading extension module torch_test_cudnn_extension...
ok
----------------------------------------------------------------------
Ran 1 test in 10.621s
OK
Thank you! I was able to run the test as well, and it produced identical results to what you showed. However, it seems the test invokes c++, while in my case nvcc is invoked. And -DPYBIND11_COMPILER_TYPE=\"_gcc\" means it’s using GCC then? How is GCC able to compile CUDA code?
Also, is this using the cuDNN version inside my conda env? When I build the non-JIT way, it uses the cuDNN in my /usr/local directory instead.