Kernel Launch Deprecated: packed_accessor arguments and `Tensor.type()` alternative

Compiling the code below produces a lot of deprecation warnings.

AT_DISPATCH_FLOATING_TYPES(x.type(), "comp_cuda",
    ([&] {
        comp_cuda_kernel<scalar_t><<<grid_size, threads>>>(
            x.data<scalar_t>(), h.data<scalar_t>(), out.data<scalar_t>(),
            B, F, C, H, W);
    }));

As far as I can tell, the warnings are basically:

warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]

And

warning: ‘T* at::Tensor::data() const [with T = double]’ is deprecated: Tensor.data() is deprecated. Please use Tensor.data_ptr() instead. [-Wdeprecated-declarations]

The documentation (here and here) does not provide an alternative to x.type(). However, it does suggest a different way to refer to x.data<scalar_t>(), as in:

gates.packed_accessor<scalar_t,3,torch::RestrictPtrTraits,size_t>(),

What specifically are these arguments?

  • 3
  • torch::RestrictPtrTraits

Would the first and last arguments always be scalar_t and size_t?

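As the warning suggests, use t.options() to get something that can be passed to the factory functions, and use the getters for the individual properties (e.g. t.scalar_type() for the ScalarType, a.k.a. dtype). For the dispatch macro, that means passing x.scalar_type() in place of x.type().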

Do you want a pointer or an array-like object? If you want the pointer you got from .data before, it’s now called .data_ptr, which is a drop-in replacement.

If you want an array-like object: use .packed_accessor32 (preferred over the version with size_t if you know it is sufficient). The 3 is the number of dimensions of the tensor (fixed at compile time, checked at runtime). The third argument is the pointer traits: DefaultPtrTraits adds no extra qualification, while RestrictPtrTraits adds __restrict__ to the pointer type, meaning that the compiler may assume that the tensor's memory does not overlap with other (written-to) tensors, which allows more aggressive caching.
The first argument is the C/CUDA data element type of the tensor (fixed at compile time, checked at runtime). The last one is the data type used for the indexing arithmetic; ideally it should not be size_t but int32_t if that works, or int64_t otherwise (32-bit is much faster than 64-bit on CUDA).
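To make the four template arguments concrete, here is a minimal sketch in plain C++ of what such an accessor boils down to. MiniPackedAccessor, make_contiguous, demo_read and the MINI_RESTRICT macro are hypothetical names made up for illustration; this is a simplified stand-in under those assumptions, not PyTorch's actual PackedTensorAccessor.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Portable spelling of the restrict qualifier (an assumption for this sketch).
#if defined(__GNUC__) || defined(__clang__)
#define MINI_RESTRICT __restrict__
#elif defined(_MSC_VER)
#define MINI_RESTRICT __restrict
#else
#define MINI_RESTRICT
#endif

// Pointer traits: the only difference is whether the stored pointer
// type carries the restrict qualifier.
template <typename T>
struct DefaultPtrTraits { using PtrType = T*; };

template <typename T>
struct RestrictPtrTraits { using PtrType = T* MINI_RESTRICT; };

// Hypothetical miniature accessor. Template arguments, in order:
// element type, number of dimensions, pointer traits, index type.
template <typename T, std::size_t N,
          template <typename> class PtrTraits, typename index_t>
struct MiniPackedAccessor {
    typename PtrTraits<T>::PtrType data;
    std::array<index_t, N> sizes;
    std::array<index_t, N> strides;

    // Turn an N-dimensional index into a flat offset using index_t arithmetic.
    T& at(const std::array<index_t, N>& idx) const {
        index_t offset = 0;
        for (std::size_t d = 0; d < N; ++d) {
            assert(idx[d] >= 0 && idx[d] < sizes[d]);
            offset += idx[d] * strides[d];
        }
        return data[offset];
    }
};

// Build an accessor over a contiguous (row-major) buffer.
template <typename T, std::size_t N,
          template <typename> class PtrTraits, typename index_t>
MiniPackedAccessor<T, N, PtrTraits, index_t>
make_contiguous(T* data, std::array<index_t, N> sizes) {
    MiniPackedAccessor<T, N, PtrTraits, index_t> acc{data, sizes, {}};
    index_t running = 1;
    for (std::size_t d = N; d-- > 0;) {
        acc.strides[d] = running;
        running *= sizes[d];
    }
    return acc;
}

// Usage: a 2x3x4 float buffer indexed with 32-bit arithmetic.
inline float demo_read() {
    float buf[24];
    for (int i = 0; i < 24; ++i) buf[i] = static_cast<float>(i);
    auto acc = make_contiguous<float, 3, RestrictPtrTraits, std::int32_t>(
        buf, {2, 3, 4});
    return acc.at({1, 2, 3});  // flat offset 1*12 + 2*4 + 3 = 23
}
```

The element type and the dimension count change what the accessor can hold, while the pointer traits and the index type only change how the compiler is allowed to generate the loads, stores and offset arithmetic.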

Best regards

Thomas

Thank you so much for your feedback and clarification.

By checking their code, I noticed that the variable gates can be indexed like an array instead of being accessed through a pointer.


With the deprecation warning, it seems to be more than a matter of preference.


In my case the out tensor, like output_gate in the example, is the tensor that data gets written to. However, that argument is still passed using packed_accessor. I don't think I understood what you meant about written-to tensors. Could you expand a bit more on that?
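As a rough illustration of the restrict qualifier in this situation, the sketch below uses plain C++ rather than CUDA. scale_into, scaled_copy and the SKETCH_RESTRICT macro are hypothetical names. The point it tries to show is that a written-to buffer can be restrict-qualified too: the promise is about the pointers not referring to the same memory, not about whether a buffer is read or written.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Portable spelling of the restrict qualifier (an assumption for this sketch).
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_RESTRICT __restrict__
#elif defined(_MSC_VER)
#define SKETCH_RESTRICT __restrict
#else
#define SKETCH_RESTRICT
#endif

// Both pointers are restrict-qualified, including the one that is written to.
// The caller promises that `in` and `out` never refer to overlapping memory.
inline void scale_into(const float* SKETCH_RESTRICT in,
                       float* SKETCH_RESTRICT out,
                       std::size_t n, float factor) {
    // Cheap sanity check only: identical start pointers would break the promise.
    assert(n == 0 || in != out);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = in[i] * factor;
    }
}

// Valid use: input and output live in two separate buffers.
inline std::vector<float> scaled_copy(const std::vector<float>& in, float factor) {
    std::vector<float> out(in.size());
    scale_into(in.data(), out.data(), in.size(), factor);
    return out;
}
```

Passing the same buffer as both in and out would violate the promise, and the compiler would be free to produce wrong results; in that situation the default (unqualified) pointer traits are the safer choice.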


Also, in my case, the tensors are of type float32. However, I think you are referring to the index type, right? It never occurred to me to think about the index type. I think int32_t should be enough for my cases, since I don't plan on having a tensor with more than 2^32 elements.
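One detail worth checking here: int32_t is signed, so the largest offset it can hold is 2^31 - 1 rather than 2^32. A minimal sketch of such a check is below; fits_int32_index is a hypothetical helper, and it assumes a contiguous tensor whose element count does not overflow 64 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Conservative check: returns true when the element count of a contiguous
// tensor with these sizes stays within the range of a signed 32-bit integer,
// so every flat offset can be computed with int32_t arithmetic.
inline bool fits_int32_index(const std::vector<std::int64_t>& sizes) {
    std::int64_t numel = 1;
    for (std::int64_t s : sizes) {
        assert(s >= 0);  // tensor sizes are never negative
        numel *= s;
    }
    return numel <= std::numeric_limits<std::int32_t>::max();
}
```

For example, a tensor of shape 8 x 64 x 128 x 128 has 8,388,608 elements and passes the check, while a 65536 x 65536 tensor (2^32 elements) does not and would need 64-bit indexing.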


Thank you for your time.