Source build succeeds, but is not usable

I’ve been trying to install the unstable version of PyTorch from source with CUDA 8.0 support, which should be possible with the Titan X graphics card I use. The build succeeds, but whenever I run an actual computation on the GPU I get the following error:

$ python3.6
>>> import torch
>>> torch.cuda.is_available()
>>> x = torch.rand(5, 3)
>>> y = torch.rand(5, 3)
>>> x = x.cuda()
>>> y = y.cuda()
>>> x+y
THCudaCheck FAIL file=/home/rwever/Devil/pytorch/aten/src/THC/generated/../generic/ line=265 error=8 : invalid device function
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: cuda runtime error (8) : invalid device function at /home/rwever/Devil/pytorch/aten/src/THC/generated/../generic/

The build process does indicate that it finds CUDA 8.0, and (I think) shows no notable warnings or errors:

Found CUDA: /usr/local/cuda-8.0 (found suitable version "8.0", minimum required is "5.5")
-- Building with NumPy bindings
-- Detected cuDNN at /usr/local/cuda-8.0/lib64/, /usr/local/cuda-8.0/include
-- Detected CUDA at /usr/local/cuda-8.0

I am using the same build flags that I’ve been using on another machine, without any luck unfortunately.

export CUDA_HOME=/usr/local/cuda-8.0
export TORCH_CUDA_ARCH_LIST="6.0;6.1"
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64

and my $PATH variable contains /usr/local/cuda-8.0/bin as the first entry. I’m using python3.6.5 and Linux Mint 17.1.
One notable quirk might be that I install with $ python3 setup.py install --user as I don’t have root permissions.


Why do you set export TORCH_CUDA_ARCH_LIST="6.0;6.1"? This should be detected automatically if you don’t specify it. You can check the result of the detection on the line just after the Found CUDA: ... line in the build output.
If you have the original Titan X, then its compute capability is neither of these two, so you are not compiling code for the GPU you actually have.
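To illustrate the mismatch, here is a minimal sketch in plain Python of how an arch list relates to a device's compute capability. The helper names are hypothetical and not part of PyTorch. It assumes only numeric entries such as "6.0" or "5.2+PTX" (named entries are not handled), and it encodes the general CUDA rules: a compiled binary runs on devices with the same major version and an equal or higher minor version, while PTX can be JIT-compiled for any newer device.

```python
def parse_arch_list(arch_list):
    """Parse a string such as "6.0;6.1" or "5.2+PTX" into (major, minor, has_ptx) tuples."""
    entries = []
    for token in arch_list.replace(";", " ").split():
        has_ptx = token.upper().endswith("+PTX")
        version = token[:-4] if has_ptx else token
        major, minor = version.split(".")
        entries.append((int(major), int(minor), has_ptx))
    return entries


def arch_list_covers(arch_list, capability):
    """Return True if code built for arch_list can run on a device with this capability."""
    dev_major, dev_minor = capability
    for major, minor, has_ptx in parse_arch_list(arch_list):
        # Compiled binaries run on the same major version with an equal or higher minor.
        if major == dev_major and minor <= dev_minor:
            return True
        # PTX can be JIT-compiled for any device at or above the listed capability.
        if has_ptx and (major, minor) <= (dev_major, dev_minor):
            return True
    return False


# A capability 5.2 device with the flags used in this thread, then with a matching arch.
print(arch_list_covers("6.0;6.1", (5, 2)))
print(arch_list_covers("5.2", (5, 2)))
```

When the first check is false, the build contains no kernels the card can execute, which is consistent with an "invalid device function" error at the first GPU operation.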

I’m setting that particular cuda arch value because of this information, which lists it as having that compute capability:

On another machine with a GeForce GTX 1080 Ti, the build did not work until I set the value to 6.1, so I figured I also had to set it for the Titan X (which is newer/more powerful). And it should support those archs according to the NVIDIA site, right? How can I find the maximum arch it supports?

The thing is that the name “Titan X” refers to many different cards from different generations. What happens if you don’t set this env variable at all? Does the detection fail?

I see. I think I have the GeForce GTX TITAN X from 2015, according to Wikipedia and this command:

$ nvidia-smi --query-gpu=name --format=csv,noheader

Supposedly, it has a compute capability of 5.2, which I have not tried, I think.

I didn’t get it to build without setting TORCH_CUDA_ARCH_LIST for other cards, but I’m trying without it now.

If it’s a 2015 one, then the compute capability is 5.0 iirc.
Still, that should be detected by the build automatically. I think the detection for the 1080 was buggy when those cards had just come out, but it should always work now.

Ah right, my previous problem indeed concerned a 1080.

Everything seems to work now and this can be considered solved, thanks for your help. The correct maximum CUDA arch was 5.2, btw.
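For anyone hitting the same error, a small sketch of how the capability could be read from Python instead of looked up by card name. It assumes the installed build exposes torch.cuda.get_device_capability and that the driver can see the card. The detect_arch helper is a hypothetical name, not a PyTorch function.

```python
def detect_arch(device_index=0):
    """Return the compute capability as a string such as "5.2", or None if it cannot be queried."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this interpreter.
        return None
    if not torch.cuda.is_available():
        return None
    major, minor = torch.cuda.get_device_capability(device_index)
    return f"{major}.{minor}"


arch = detect_arch()
if arch is None:
    print("No usable CUDA device visible to PyTorch")
else:
    # Value to use only if the automatic detection has to be overridden.
    print(f'export TORCH_CUDA_ARCH_LIST="{arch}"')
```

Since the detection is supposed to work automatically, leaving TORCH_CUDA_ARCH_LIST unset is probably the simpler option. The printed value is only a fallback.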