How to write a custom CPU kernel

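  • For a CPU tensor, I think the equivalent would be Tensor.accessor<…>,
    or you can use a plain Tensor.data_ptr<…>. The latter gives you a raw
    pointer, so make the tensor contiguous first or handle the strides yourself.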
  • You can use at::parallel_for to split the loop across threads; it runs on OpenMP or whichever threading backend PyTorch was built with. See also: "Using at::parallel_for in a custom operator".