Hi there,
I wrote a custom CUDA kernel for PyTorch 2.3.0 and CUDA 12.1.
My nvcc version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Fri_Jun_14_16:34:21_PDT_2024
Cuda compilation tools, release 12.6, V12.6.20
Build cuda_12.6.r12.6/compiler.34431801_0
I compiled it via setuptools, e.g.:
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

setup(
    name='test',
    ext_modules=[
        CUDAExtension('test_k', [
            'test.cpp',
            'test_cuda_kernel.cu'
        ])
    ],
    cmdclass={
        'build_ext': BuildExtension
    })
The kernel was compiled on an RTX 3090. However, when trying to run it on an A100, I get the following error:
RuntimeError: CUDA error: no kernel image is available for execution on the device
What am I doing wrong?
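One possible direction, as a rough sketch rather than a confirmed fix: the RTX 3090 has compute capability 8.6 and the A100 has 8.0, and when the `TORCH_CUDA_ARCH_LIST` environment variable is unset, `torch.utils.cpp_extension` targets only the GPUs visible at build time. Assuming that is what happened here, setting the variable at the top of setup.py, before `setup()` is called, might cover both cards. The helper name `arch_list` is hypothetical and not part of the original script.

```python
import os


def arch_list(capabilities):
    """Format (major, minor) compute capabilities as 'M.m;M.m', sorted and de-duplicated."""
    return ";".join(f"{major}.{minor}" for major, minor in sorted(set(capabilities)))


# Assumption: target both the A100 (8.0) and the RTX 3090 (8.6).
# setdefault keeps any value that was already exported in the shell.
os.environ.setdefault("TORCH_CUDA_ARCH_LIST", arch_list([(8, 0), (8, 6)]))
```

The extension would then need to be rebuilt from a clean build directory so that the kernel is recompiled for the listed architectures.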