How to access the elements of a CUDA tensor

I have a const Tensor& tensor in a .cpp file, and it is a CUDA tensor. I want to access the elements of the tensor and modify them, so I created a .cu file that defines a kernel to do that. What I am not sure about is whether it is correct to pass tensor.data_ptr() as an argument to the CUDA kernel and dereference that pointer inside the kernel to access the elements. If not, is there another way to access the data?

What I tried is the following:
in .cpp
tensor_change((float*)tensor.data_ptr(), number);
in .cu
__global__ void tc(float* po, int number) {
printf("%d\n", number);
printf("pval = %p\n", po);
printf("val = %f\n", *po);
printf("val_next = %f\n", *(po+1)); }

void tensor_change(float* po, int number){
tc<<<1, 1>>>(po, number); }

The result is:
pval = 0x7faf27200000
val = 0.000000
val_next = 0.000000

Any help will be appreciated!

Have you tried these two approaches?

  1. using the index API,

  2. using CUDA accessors:
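#include <torch/extension.h>
#include <ATen/cuda/Atomic.cuh>  // gpuAtomicAdd

__global__ void packed_accessor_kernel(
    torch::PackedTensorAccessor64<float, 2> foo,
    float* trace) {
  int i = threadIdx.x;
  // Each thread adds one diagonal element to the shared accumulator.
  gpuAtomicAdd(trace, foo[i][i]);
}

// Host side. The tensor must live on the GPU for the kernel to read it.
torch::Tensor foo = torch::rand({12, 12}, torch::kCUDA);

// assert foo is 2-dimensional and holds floats.
auto foo_a = foo.packed_accessor64<float,2>();
// The accumulator must be device memory too; a kernel cannot dereference a host pointer.
torch::Tensor trace = torch::zeros({1}, torch::kCUDA);

packed_accessor_kernel<<<1, 12>>>(foo_a, trace.data_ptr<float>());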

BTW, the accessor API is faster than the index API, so it is the better choice when you need to make a large number of element accesses.
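
To make the difference between a raw pointer and an accessor more concrete, here is a rough sketch of what an accessor boils down to: a data pointer bundled with sizes and strides, where indexing is plain offset arithmetic. It is written as host-only C++ so it runs without a GPU. MiniAccessor2D, trace_of and demo_trace are hypothetical names made up for this illustration; they are not part of the PyTorch API, and the real accessor classes differ in detail.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a 2-D packed accessor: a raw pointer plus
// sizes and strides, held by value so the struct can be copied freely.
struct MiniAccessor2D {
    float* data;
    std::int64_t sizes[2];
    std::int64_t strides[2];

    // Element (i, j) lives at data[i * strides[0] + j * strides[1]].
    float& at(std::int64_t i, std::int64_t j) const {
        return data[i * strides[0] + j * strides[1]];
    }
};

// Sum of the diagonal, the same quantity the kernel above accumulates.
float trace_of(const MiniAccessor2D& a) {
    float sum = 0.0f;
    for (std::int64_t i = 0; i < a.sizes[0] && i < a.sizes[1]; ++i) {
        sum += a.at(i, i);
    }
    return sum;
}

// Builds a contiguous row-major 3x3 buffer holding 0..8, overwrites the
// centre element through the accessor, and returns the trace.
float demo_trace() {
    std::vector<float> storage(9);
    for (int k = 0; k < 9; ++k) storage[k] = static_cast<float>(k);
    MiniAccessor2D a{storage.data(), {3, 3}, {3, 1}};
    a.at(1, 1) = 10.0f;   // write through the accessor
    return trace_of(a);   // diagonal is now 0, 10, 8
}
```

With a bare float* such as po, *(po+1) is only "the next element" when the tensor is contiguous. Carrying the strides along, as above, is what keeps the indexing correct for non-contiguous views such as a transpose.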

Really appreciate your help! And sorry that I missed this message. Accessor API is a good approach.