Writing a custom C++ extension with variants for both CPU and GPU?

Hi.

I am interested in writing a custom C++/CUDA extension.

The tutorial here only shows a scenario with a pure CUDA kernel, which will not work on a machine that doesn’t have CUDA. I’d like to build an extension that automatically uses either a CPU version or the CUDA version, akin to what PyTorch itself does. Is there a way to use PyTorch’s dispatcher to accomplish this from within an extension? I know this can be done with if statements or #if preprocessor directives, but I’d like a more “automatic” solution.
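To make "automatic" more concrete, here is a rough sketch of the pattern I have in mind. It is plain C++ and does not use PyTorch at all: `Device`, `FakeTensor`, `scale_table`, `RegisterScale`, and `scale` are hypothetical names I made up for illustration, not PyTorch APIs. Each backend registers its own kernel in a table, and the public function picks the kernel from the tensor's device, so the caller never writes an if statement.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT PyTorch types.
enum class Device { CPU, CUDA };

struct FakeTensor {
    Device device;
    std::vector<double> data;
};

using Kernel = std::function<FakeTensor(const FakeTensor&, double)>;

// Per-operator table mapping a device to its kernel.
// A function-local static avoids static-initialization-order problems.
std::map<Device, Kernel>& scale_table() {
    static std::map<Device, Kernel> table;
    return table;
}

// Helper whose constructor registers a kernel before main() runs.
struct RegisterScale {
    RegisterScale(Device d, Kernel k) { scale_table()[d] = std::move(k); }
};

// CPU variant: multiplies every element by s.
FakeTensor scale_cpu(const FakeTensor& t, double s) {
    FakeTensor out{t.device, t.data};
    for (double& v : out.data) v *= s;
    return out;
}
static RegisterScale reg_scale_cpu(Device::CPU, scale_cpu);

// Assumption: a CUDA variant would live in a separate .cu file that is only
// compiled when CUDA is available, and would register itself the same way:
//   static RegisterScale reg_scale_cuda(Device::CUDA, scale_cuda);

// Public entry point: looks up the kernel for the tensor's device.
FakeTensor scale(const FakeTensor& t, double s) {
    auto it = scale_table().find(t.device);
    if (it == scale_table().end()) {
        throw std::runtime_error("scale: no kernel registered for this device");
    }
    return it->second(t, s);
}
```

With this layout, calling `scale` on a CPU tensor runs `scale_cpu`, and calling it on a CUDA tensor in a CPU-only build raises an error instead of failing at compile or link time. What I am hoping for is the equivalent of this registration step using PyTorch's own dispatcher rather than a hand-rolled table.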

Thank you!

P.S.: I have already searched the forums for this topic but couldn’t find a solution.