Is it possible to use the PyTorch C++ frontend API in a CUDA kernel?

Like this:
__global__ void function(torch::Tensor A, torch::Tensor B, torch::Tensor C, int n) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n) {
        C[tid] = A[tid] + B[tid];
    }
}

I don’t think that’s possible; you would usually pass the data pointers to the kernel instead.
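
To illustrate the pointer-based approach, here is a minimal sketch, assuming 1-D-style elementwise addition of float32 CUDA tensors with the same number of elements. The names add_kernel_raw and add_cuda_raw are hypothetical, and the file only includes ATen/ATen.h rather than torch/torch.h. The kernel itself only ever sees raw device pointers; the Tensor objects stay on the host side.

```cuda
#include <ATen/ATen.h>
#include <cuda_runtime.h>

// Kernel operates on raw device pointers; it never sees a Tensor object.
__global__ void add_kernel_raw(const float* a, const float* b, float* c, int64_t n) {
    int64_t tid = static_cast<int64_t>(blockIdx.x) * blockDim.x + threadIdx.x;
    if (tid < n) {
        c[tid] = a[tid] + b[tid];
    }
}

// Hypothetical host wrapper: validates inputs, extracts pointers, launches the kernel.
at::Tensor add_cuda_raw(const at::Tensor& A, const at::Tensor& B) {
    TORCH_CHECK(A.is_cuda() && B.is_cuda(), "inputs must be CUDA tensors");
    TORCH_CHECK(A.scalar_type() == at::kFloat && B.scalar_type() == at::kFloat,
                "this sketch assumes float32 inputs");
    TORCH_CHECK(A.numel() == B.numel(), "inputs must have the same number of elements");

    // Raw pointer indexing assumes contiguous memory, so make that explicit.
    at::Tensor a = A.contiguous();
    at::Tensor b = B.contiguous();
    at::Tensor c = at::empty_like(a);

    const int64_t n = a.numel();
    if (n == 0) {
        return c;
    }

    const int threads = 256;
    const int blocks = static_cast<int>((n + threads - 1) / threads);
    // Launched on the default stream for simplicity.
    add_kernel_raw<<<blocks, threads>>>(
        a.data_ptr<float>(), b.data_ptr<float>(), c.data_ptr<float>(), n);
    return c;
}
```

Calling add_cuda_raw(A, B) from host code then returns a new tensor, so the caller keeps working with Tensors and only the kernel boundary deals in pointers.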


Do you mean that the example code doesn’t work, or that the libtorch API can’t be used with the nvcc compiler (in a .cu file)?

I just want to use the libtorch API in a .cu file.
My actual problem is that when I try #include <torch/torch.h> in a .cu file (compiled with nvcc), I get error E1866
on “C10_DEFINE_DEPRECATED_USING(IntList, ArrayRef<int64_t>)”.
Am I doing something wrong, or is it impossible?

You can use Tensors in .cu files as seen here. I just don’t think that the actual kernel will accept these.

Regarding the error message: could you use ArrayRef instead of IntList?

You probably want PackedTensorAccessors for this.
See e.g. the BatchNorm kernels for details (this is currently called GenericPackedTensorAccessor; if you know your required index size, you can use PackedTensorAccessor32 or PackedTensorAccessor64).
They offer array-like elementwise access in kernels and can be passed (by value(!)) to the kernel instead of pointer + dimensions + strides.
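
As a rough illustration of the accessor approach, here is a minimal sketch, assuming 1-D float32 CUDA tensors small enough for 32-bit indexing. The names add_kernel_accessor and add_cuda_accessor are hypothetical. The accessors are created on the host with packed_accessor32 and passed to the kernel by value, so the kernel can index them like arrays and query their size.

```cuda
#include <ATen/ATen.h>
#include <cuda_runtime.h>

// Shorthand for a 1-D float accessor with 32-bit indexing.
using Accessor1D = at::PackedTensorAccessor32<float, 1, at::RestrictPtrTraits>;

// Accessors are taken by value: they carry the data pointer, sizes, and strides.
__global__ void add_kernel_accessor(Accessor1D a, Accessor1D b, Accessor1D c) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < c.size(0)) {
        c[tid] = a[tid] + b[tid];
    }
}

// Hypothetical host wrapper: builds the accessors and launches the kernel.
at::Tensor add_cuda_accessor(const at::Tensor& A, const at::Tensor& B) {
    TORCH_CHECK(A.is_cuda() && B.is_cuda(), "inputs must be CUDA tensors");
    TORCH_CHECK(A.scalar_type() == at::kFloat && B.scalar_type() == at::kFloat,
                "this sketch assumes float32 inputs");
    TORCH_CHECK(A.dim() == 1 && B.dim() == 1, "this sketch assumes 1-D inputs");
    TORCH_CHECK(A.size(0) == B.size(0), "inputs must have the same length");

    at::Tensor C = at::empty_like(A);
    const int64_t n = A.size(0);
    if (n == 0) {
        return C;
    }

    const int threads = 256;
    const int blocks = static_cast<int>((n + threads - 1) / threads);
    // Strides travel with the accessor, so the inputs need not be contiguous.
    add_kernel_accessor<<<blocks, threads>>>(
        A.packed_accessor32<float, 1, at::RestrictPtrTraits>(),
        B.packed_accessor32<float, 1, at::RestrictPtrTraits>(),
        C.packed_accessor32<float, 1, at::RestrictPtrTraits>());
    return C;
}
```

Compared with passing raw pointers, this keeps the indexing in the kernel close to the C[tid] = A[tid] + B[tid] form from the original question, without a separate length argument.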

Best regards



Did you solve the problem?
Are you Son from the PytorchKorea community?

Thanks for your reply.

Sorry for the late reply.
I can now include <ATen/ATen.h> in a .cu file,
but I still get errors if I include <torch/torch.h>.

at::Tensor cannot store gradients, right?
So I need torch::Tensor. Is it possible to use torch::Tensor in CUDA code (a .cu file)?