Build CUDAExtension on non-CUDA environment

CUDAExtension may fail in an environment where CUDA isn’t installed. What is the best practice for making CUDAExtension compatible with a non-CUDA environment?

Traceback (most recent call last):
  File "", line 23, in <module>
  File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/torch/utils/", line 476, in CUDAExtension
    library_dirs += library_paths(cuda=True)
  File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/torch/utils/", line 555, in library_paths
    if (not os.path.exists(_join_cuda_home(lib_dir)) and
  File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/torch/utils/", line 1146, in _join_cuda_home
    raise EnvironmentError('CUDA_HOME environment variable is not set. '
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.

You can’t build a CUDA extension without the CUDA headers and libraries. I think your main options are either to require the CUDA toolkit (you shouldn’t actually need a GPU, just the toolkit), or to ship separate CUDA and non-CUDA versions of your extension.
If you go with separate versions, you can either keep a shared codebase with the CUDA-specific parts split out, or use standard C++ tricks like preprocessor defines to share one codebase while excluding the CUDA parts from CPU-only builds.
On the Python side you’d want a custom setuptools command that adds command-line options to control whether you install a CUDAExtension or a CppExtension. Or just publish separate Python packages.
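As a rough illustration of the custom-option idea, here is a minimal sketch (the `--cpu-only` flag name and the surrounding setup are assumptions, not a real torch API): strip a custom flag out of `sys.argv` before setuptools parses it, then use it to decide which extension type to build.

```python
# Sketch of a setup.py-style flag (names here are hypothetical).
# Remove our custom flag before setuptools sees it, so it doesn't
# complain about an unknown option.
import sys


def pop_flag(argv, flag):
    """Remove `flag` from argv if present; return whether it was there."""
    if flag in argv:
        argv.remove(flag)
        return True
    return False


cpu_only = pop_flag(sys.argv, '--cpu-only')

# In a real setup.py you would then pick the extension type, e.g.:
#   from torch.utils.cpp_extension import CppExtension, CUDAExtension
#   ext_cls = CppExtension if cpu_only else CUDAExtension
print('building', 'CppExtension' if cpu_only else 'CUDAExtension')
```

Invoking `python setup.py install --cpu-only` would then select the CPU-only build, while a plain `python setup.py install` builds the CUDA version.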

Thank you for the great answer!

No problem. Actually, while I think the basic idea of selective compilation is correct, you can do the setup side more easily by just checking for the CUDA toolkit: if it’s detected, install a CUDAExtension, otherwise a CppExtension. See torchvision’s setup.py for an example of how to do this nicely. Stripping out all the actual config, the relevant part looks like this:

import os

import torch
from torch.utils.cpp_extension import CUDA_HOME, CppExtension, CUDAExtension

def get_extensions():
    define_macros = []
    extension = CppExtension
    if (torch.cuda.is_available() and CUDA_HOME is not None) or os.getenv('FORCE_CUDA', '0') == '1':
        extension = CUDAExtension
        define_macros += [('WITH_CUDA', None)]
    # 'my_ext' and the sources list are placeholders for your actual config
    return [extension('my_ext', sources=['my_ext.cpp'], define_macros=define_macros)]


That `WITH_CUDA` define is then used to guard the CUDA-specific parts of the C++ code. Much cleaner than my suggestion of a custom command.