No kernel image is available for execution on the device pytorch1.1 compiled from source

Hi,

I have the following problem:

THCudaCheck FAIL file=/home/dhhorka/libs/pytorch/aten/src/THC/THCTensorMathCompareT.cuh line=69 error=48 : no kernel image is available for execution on the device.

I compiled PyTorch 1.1 from source (using the code tagged v1.1.0). I am running this code on a remote server that has two nodes with different GPU architectures. I compiled the code on the node that has the old GPUs. When I execute the PyTorch script on those GPUs the code works properly, but when I execute it on the new GPUs (2080 Ti) I get the error above.

I tried compiling PyTorch on the node with the new GPUs. The code then works on the new GPUs, but it does not work on the old ones. I am using CUDA 10. Is there any way to make it work on both GPUs?

P.S.: I did not assign a category to this topic because I am not sure which one fits best in this case.

Hi @Dhorka, I am having this issue too. Were you able to solve it or find help somewhere else?

This error is raised if your PyTorch binary or source build wasn’t compiled for the compute capability of your GPU.

Which GPU are you using and how did you install / build PyTorch?

Is there a way to build PyTorch from source for more than one GPU compute capability? The thing is that I have two different GPUs with different compute capabilities…

You can build for multiple GPU architectures at once by setting TORCH_CUDA_ARCH_LIST before building. For example, we build with:

export TORCH_CUDA_ARCH_LIST="3.7;6.0;7.0;7.5"

Wikipedia has a good table mapping GPUs to compute capabilities: https://en.wikipedia.org/wiki/CUDA
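To check whether a given GPU would be covered by an architecture list before spending time on a rebuild, something like the following rough sketch may help. The helper names `parse_arch_list` and `is_covered` are made up for illustration and are not part of PyTorch. It assumes the usual CUDA compatibility rules (compiled code for X.y runs on devices X.z with z >= y, and a `+PTX` entry can be JIT-compiled for any newer device), and it only handles numeric entries such as `7.5` or `7.5+PTX`, not architecture names.

```python
import os


def parse_arch_list(arch_list):
    """Parse a string like '6.0;7.5+PTX' into [((major, minor), has_ptx), ...].

    Hypothetical helper: handles numeric entries only, not names like 'Volta'.
    """
    entries = []
    for item in arch_list.replace(" ", ";").split(";"):
        item = item.strip()
        if not item:
            continue
        has_ptx = item.endswith("+PTX")
        version = item[:-4] if has_ptx else item
        major, minor = version.split(".")
        entries.append(((int(major), int(minor)), has_ptx))
    return entries


def is_covered(capability, arch_list):
    """Return True if a device with `capability` can run a build for `arch_list`."""
    major, minor = capability
    for (a_major, a_minor), has_ptx in parse_arch_list(arch_list):
        # Compiled binary code runs on the same major version, same or newer minor.
        if a_major == major and a_minor <= minor:
            return True
        # PTX can be JIT-compiled for any device of equal or newer capability.
        if has_ptx and (a_major, a_minor) <= (major, minor):
            return True
    return False


if __name__ == "__main__":
    arch_list = os.environ.get("TORCH_CUDA_ARCH_LIST", "3.7;6.0;7.0;7.5")
    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None and torch.cuda.is_available():
        for i in range(torch.cuda.device_count()):
            cap = torch.cuda.get_device_capability(i)
            print(torch.cuda.get_device_name(i), cap, is_covered(cap, arch_list))
    else:
        print("No CUDA device visible; skipping the device check.")
```

Running this on each node with the arch list you plan to build with should show whether both the old GPUs and the 2080 Ti (compute capability 7.5) are covered.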

The binaries should be built with the same architecture list as posted by @hartbx, if I’m not mistaken.

I am using an Nvidia Tesla v100.

Thanks for pointing out the error. This is what happened to me indeed.

More specifically, I was running a script on a remote cluster using GPUs and retrieving some GPU tensors. While trying to read these tensors on my local computer (which only supports CPU tensors) I encountered this error.

I’m now converting these tensors while on the cluster.
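In case it helps someone else, here is a minimal sketch of what "converting on the cluster" could look like. The helper `to_cpu` is a hypothetical name, not a PyTorch function. It assumes the results are tensors nested in plain dicts, lists, or tuples, and it relies only on the `.cpu()` method that tensors provide. The file name `results.pt` is also just a placeholder.

```python
def to_cpu(obj):
    """Recursively move tensor-like leaves to the CPU.

    Hypothetical helper: anything with a .cpu() method (e.g. torch.Tensor)
    is converted; dicts, lists, and tuples are walked; other values are
    returned unchanged.
    """
    if isinstance(obj, dict):
        return {key: to_cpu(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(to_cpu(value) for value in obj)
    if hasattr(obj, "cpu"):
        return obj.cpu()
    return obj


# On the cluster (assumes `results` holds the GPU tensors):
#     import torch
#     torch.save(to_cpu(results), "results.pt")
#
# On the CPU-only machine, map_location can also be passed when loading:
#     results = torch.load("results.pt", map_location="cpu")
```

Passing `map_location="cpu"` to `torch.load` on the local machine may be enough on its own, but saving CPU tensors in the first place avoids depending on it.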

Thanks a lot!