Add -rdc=True flag when compiling C++ CUDA code with setup.py

Hello,

I am trying to build a custom C++/CUDA kernel to use in my PyTorch code, and my setup.py file is below. By default, the compilation command that setup.py invokes does not include -rdc=True, which my C++ code needs. Is there a way to add this flag somewhere in setup.py?

Thank you.

from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

setup(
    name='cpp_kernel',
    ext_modules=[
        CUDAExtension('cpp_kernel', [
            'main.cpp',
            'cuda_kernel.cu',
        ])
    ],
    cmdclass={
        'build_ext': BuildExtension
    })

The default command is this:

@ptrblck, I am tagging you here in case you know about this (or know someone I could ask). Thank you!

Relocatable device code linking was added in this PR and is explained in this doc.
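
For reference, here is a minimal sketch of what a setup.py using that feature might look like. It assumes a PyTorch version whose CUDAExtension accepts the dlink keyword described in the cpp_extension docs, and it reuses the file names from the question. The helper name rdc_extension_kwargs is only illustrative and not part of PyTorch. As far as I recall, the docs also note that Ninja is required for RDC linking, so the sketch keeps the default BuildExtension rather than disabling Ninja. If the device-link step needs extra libraries, the docs describe a dlink_libraries keyword for that; it is left out here because whether you need it depends on your kernel.

```python
import sys


def rdc_extension_kwargs():
    """Keyword arguments for CUDAExtension with relocatable device code.

    Illustrative helper (not a PyTorch API); file names come from the question.
    """
    return dict(
        name='cpp_kernel',
        sources=['main.cpp', 'cuda_kernel.cu'],
        # Ask the extension builder to add an nvcc device-link (-dlink) step.
        dlink=True,
        # Compile the .cu sources as relocatable device code.
        extra_compile_args={'nvcc': ['-rdc=true']},
    )


# Only invoke setup() when a build command was passed,
# e.g. `python setup.py build_ext` or `python setup.py install`.
if __name__ == '__main__' and len(sys.argv) > 1:
    from setuptools import setup
    from torch.utils.cpp_extension import BuildExtension, CUDAExtension

    setup(
        name='cpp_kernel',
        ext_modules=[CUDAExtension(**rdc_extension_kwargs())],
        # Default BuildExtension, so Ninja is used when it is available.
        cmdclass={'build_ext': BuildExtension},
    )
```

The point of dlink=True is that -rdc=true alone only changes how the object files are compiled; the separate device-link step is what resolves device symbols across those objects before the host linker runs.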

@ptrblck, I modified my setup.py file to this:

from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

setup(
    name='cpp_kernel',
    ext_modules=[
        CUDAExtension(
            name='cpp_kernel',
            sources=['main.cpp', 'cuda_kernel.cu'],
            extra_compile_args={'nvcc': ['-lcudadevrt', '-rdc=true']})
    ],
    cmdclass={
        'build_ext': BuildExtension.with_options(use_ninja=False)
    })

Upon compiling, I now get this error that I cannot figure out how to solve. If you know how to fix this, can you please tell me? Thank you so much for your help.

(I highlighted the error at the bottom of the output)