Debug a custom CUDA kernel

Hi all,

I am wondering whether there are any instructions on how to debug custom CUDA kernels that are loaded with torch.utils.cpp_extension.load. What I want to do is:

1. start a gdb session;
2. set breakpoints in the .cu files;
3. run the Python script (with the -G flag included in the nvcc flags passed to torch.utils.cpp_extension.load).
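For reference, this is roughly how I build the extension with device-side debug info (file and module names here are placeholders, not my actual project):

```python
# Sketch of loading a custom CUDA extension with debug flags.
# "my_ext" and "my_kernel.cu" are hypothetical names.
from torch.utils.cpp_extension import load

my_ext = load(
    name="my_ext",
    sources=["my_kernel.cu"],
    extra_cuda_cflags=["-G", "-g"],  # -G: device debug info, disables device optimizations
                                     # -g: host-side debug symbols
    verbose=True,
)
```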

However, the program never stops at the breakpoints I set.

Looking forward to your suggestions.


Plain gdb cannot step into device code; use cuda-gdb to debug CUDA kernels, which should allow you to set breakpoints in your kernel.
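A minimal session might look like the following (script, file, and kernel names are placeholders). Since cpp_extension.load compiles the extension lazily at runtime, the .cu source is not loaded when cuda-gdb starts, so the breakpoint will initially be unresolved; tell cuda-gdb to keep it pending:

```
$ cuda-gdb --args python my_script.py
(cuda-gdb) break my_kernel.cu:42      # or: break my_kernel_name
Make breakpoint pending on future shared library load? (y or [n]) y
(cuda-gdb) run
```

Once the kernel launches, execution should stop at the breakpoint, and you can inspect threads and variables with commands such as `info cuda threads` and `print`.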
