Compiling PyTorch on devices with different CUDA capability

Hi,

I compiled PyTorch on a machine that has an NVIDIA GPU with CUDA compute capability (cc) 7.5. Using the same compiled libraries on a machine with a cc 6.1 GPU worked, but I got the following error

CUDA error: no kernel image is available for execution on the device

when I tried to use them on an NVIDIA GPU with cc 7.0.

Based on https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compilation-workflow, compiled binary code should be forward compatible only across minor versions, while generated PTX code should be forward compatible across both minor and major versions. Consequently, I'm not sure how the above case could have occurred. I'd appreciate any explanation and insight into how PyTorch interacts with CUDA here.
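The compatibility rule from the programming guide can be sketched as a small (hypothetical) helper: a build can serve a device if it embeds SASS for a compatible architecture, or PTX that the driver can JIT-compile for it.

```python
def runnable_on(device_cc, sass_targets, ptx_targets):
    """Return True if a build with the given targets can run on device_cc.

    SASS (cubin) is binary-compatible only within the same major version,
    for device minor versions >= the one it was compiled for.
    PTX is JIT-compiled by the driver for any device cc >= the PTX target.
    All cc values are (major, minor) tuples.
    """
    sass_ok = any(device_cc[0] == m and device_cc[1] >= n
                  for m, n in sass_targets)
    ptx_ok = any(device_cc >= t for t in ptx_targets)
    return sass_ok or ptx_ok

# A build carrying only sm_75 SASS, as in the question:
print(runnable_on((7, 0), [(7, 5)], []))        # False: minor 0 < 5
# Adding 6.1 PTX lets a cc 7.0 device JIT-compile from it:
print(runnable_on((7, 0), [(7, 5)], [(6, 1)]))  # True
```

Note that 7.5 PTX would not help a cc 7.0 device either, since PTX only runs on devices with cc greater than or equal to its target.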

Note -
PyTorch version compiled: 1.4.0 (release/1.4 branch)

I guess you haven’t used PTX and only built PyTorch for sm_75.
You could use TORCH_CUDA_ARCH_LIST="6.1 7.5" python setup.py install to build for both architectures. You can also append +PTX to each architecture to ship PTX as well.
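A sketch of both build invocations (assumes a working CUDA toolchain and a PyTorch source checkout; the exact architecture list is up to you):

```shell
# Build SASS for cc 6.1 and 7.5 only; a cc 7.0 device is still not covered:
TORCH_CUDA_ARCH_LIST="6.1 7.5" python setup.py install

# Additionally embed PTX for each target, so devices with a higher cc
# (e.g. cc 7.0 via the 6.1 PTX) can JIT-compile kernels at load time:
TORCH_CUDA_ARCH_LIST="6.1+PTX 7.5+PTX" python setup.py install
```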


Thanks! I wasn't aware of the PTX flag. Does this flag generate just the PTX assembly for the particular architecture, or does it also generate the binaries?

The +PTX suffix embeds the PTX ISA, which the driver can JIT-compile into SASS at runtime. I'm not sure what binaries refers to here.
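If you want to see for yourself what a build actually shipped, `cuobjdump` can list the embedded SASS and PTX images. A sketch (the library path is illustrative and will differ on your system):

```shell
# List the SASS (cubin) images embedded in the built extension:
cuobjdump --list-elf libtorch_cuda.so

# List any embedded PTX; empty output means no +PTX was compiled in:
cuobjdump --list-ptx libtorch_cuda.so
```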
