Loading a tensor in Cython

I am trying to speed up non-maximum suppression (NMS) on the GPU. I need to pass a tensor into Cython, but there doesn’t seem to be a way to do that other than using torch extensions. Does anyone know of another way?