Using gpu_reduce_kernel from a PyTorch extension

I used to be able to compile a CUDA extension that included <ATen/native/cuda/Reduce.cuh> in order to use the gpu_reduce_kernel function.
In PyTorch 1.12, Reduce.cuh started to include ATen/native/cuda/jit_utils.h, which in turn includes ATen/cuda/nvrtc_stub/ATenNVRTC.h. That header does not seem to be among the headers distributed with PyTorch.
Is there another way I can use gpu_reduce_kernel or something similar? Am I doing something I’m not supposed to by trying to use it from an extension?
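For anyone who wants to check their own install, here is a minimal sketch of how the missing header could be confirmed. The helper name find_header is hypothetical (it is not part of PyTorch), and the script only assumes that the include directories are passed in as plain paths.

```python
from pathlib import Path
from typing import Iterable, List

# Header named above as missing from the distributed PyTorch headers.
MISSING_HEADER = "ATen/cuda/nvrtc_stub/ATenNVRTC.h"


def find_header(include_dirs: Iterable[str], header: str = MISSING_HEADER) -> List[Path]:
    """Return the full path of `header` under each include directory that contains it.

    `find_header` is a hypothetical helper for illustration only.
    """
    hits = []
    for include_dir in include_dirs:
        candidate = Path(include_dir) / header
        if candidate.is_file():
            hits.append(candidate)
    return hits


# Example usage (assumes PyTorch is installed, so it is left commented out):
#
#   from torch.utils.cpp_extension import include_paths
#   print(find_header(include_paths()) or "header not found")
```

An empty result from the include directories that torch.utils.cpp_extension.include_paths() reports would be consistent with the compile error described above.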

This looks like a newly introduced issue. Could you please create a GitHub issue for it so that we can track and fix it?

Sure! For reference: https://github.com/pytorch/pytorch/issues/82408