Custom CUDA modules and deviceTensor

Hello,

I’m trying to build my own custom Module, and I hope to base my work on Goldsborough’s tutorial: https://github.com/pytorch/extension-cpp

The tutorial is very nice and informative, but when reading the source code of some standard modules in PyTorch’s source tree, I can see that the metadata is encapsulated in a “deviceTensor”, so that you don’t have to pass the data pointer, sizes and strides separately to every CUDA kernel. This is done e.g. for Grid Sampling: https://github.com/pytorch/pytorch/blob/master/aten/src/THCUNN/SpatialGridSamplerBilinear.cu

Is there a way to use something similar in a custom module? Goldsborough’s example fortunately only needs one meta value, so it just passes data pointers to the CUDA kernel. But I’m trying to write my own clean, multi-type implementation of the Correlation module (a CUDA-only, old-style cffi version can be found here: https://github.com/NVIDIA/flownet2-pytorch/blob/master/networks/correlation_package/src/correlation_cuda_kernel.cu), which involves a lot of meta values, and the code can become messy in the end.

Is there a way to use deviceTensor outside of the PyTorch source tree? Can you convert an at::Tensor to a deviceTensor? It’s obviously not straightforward, since at::Tensor seems deliberately agnostic to the scalar type and device (CPU or GPU). Apparently a conversion function exists, but it needs a THCTensor as input: https://github.com/pytorch/pytorch/blob/master/aten/src/THC/generic/THCDeviceTensorUtils.cu#L5

So maybe the answer is a routine that converts an at::Tensor using the dispatch macro, which seems to generate the code for all the desired types?

Thanks for your help!