Specifying GPU or compute capability when compiling with CMake

Problem:
I have two GPUs in my system: a GeForce 960 for display and a 3080 for compute. I don’t need to compile my CMake + PyTorch program for the 960, because it will never be executed on it. I also can’t compile it for the 960, because its compute capability is too low and it doesn’t support all the features that I’m using.

The problem is that find_package(Torch REQUIRED) automatically selects compiler options to support all available GPUs. Specifying set(CMAKE_CUDA_ARCHITECTURES 86) in CMake doesn’t help, because it is overridden by Torch’s find_package script.

CMake log:

-- Autodetected CUDA architecture(s):  8.6 5.2
-- Added CUDA NVCC flags for: -gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_52,code=sm_52

Compilation then fails because sm_52 is too low for the code.

Solution:
I started writing the question, but while doing so I found the solution. I’m posting it here in case it helps others.

As with other CUDA programs, you can export CUDA_VISIBLE_DEVICES=0 before calling cmake. In my case the CUDA device order did not agree with nvidia-smi (which shows the 960 first and the 3080 second, with indices 0 and 1, respectively), and that’s why it didn’t work at first.
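
A minimal sketch of the masking approach, assuming a bash-like shell. CUDA_DEVICE_ORDER=PCI_BUS_ID is an addition that is not in the original post: by default the CUDA runtime orders devices fastest-first, which can differ from the PCI bus order that nvidia-smi reports, and setting it should make the indices line up. The index 1 and the cmake arguments are illustrative, so substitute whatever your own nvidia-smi output and project layout show.

```shell
# Make CUDA enumerate devices in PCI bus order, matching nvidia-smi.
export CUDA_DEVICE_ORDER=PCI_BUS_ID
# Expose only the compute GPU (index 1 is the 3080 in the nvidia-smi
# listing described above; adjust for your system).
export CUDA_VISIBLE_DEVICES=1

# Then configure from the same shell so autodetection only sees that GPU, e.g.:
#   cmake -S . -B build
```

After reconfiguring (ideally from a clean build directory, since detected flags may be cached), the "Autodetected CUDA architecture(s)" line should list only 8.6.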

Instead of masking the device with CUDA_VISIBLE_DEVICES, you could also specify the architectures directly, e.g. TORCH_CUDA_ARCH_LIST=8.6 python setup.py install.
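
For a plain CMake build, the same variable can probably be set in the environment before configuring. This is only a sketch, under the assumption that the CMake scripts shipped with your Torch version read TORCH_CUDA_ARCH_LIST; check the "Added CUDA NVCC flags for" log line to confirm that it took effect. The cmake arguments in the comment are illustrative.

```shell
# Build only for compute capability 8.6 (the 3080), regardless of
# which GPUs are installed or visible.
export TORCH_CUDA_ARCH_LIST="8.6"

# Then configure from the same shell, e.g.:
#   cmake -S . -B build
```

Unlike device masking, this does not depend on the device order at all, which avoids the index confusion described above.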
