[CUDA/MSVC][Suggestion] ROI Pool half-precision build error due to ambiguous comparison

Hi all,

I ran into a build issue when compiling the ROI Pool CUDA kernel on Windows + MSVC + CUDA with T=half. The kernel fails to compile due to an ambiguous comparison operator.

Environment

  • OS: Windows 10/11

  • Compiler: MSVC 19.x (Visual Studio 2022)

  • CUDA: 12.x

  • PyTorch / torchvision: LibTorch 2.3.1 + torchvision csrc

Problematic code

In torchvision/csrc/ops/cuda/roi_pool_kernel.cu, inside roi_pool_forward_kernel_impl:

if (offset_input[input_index] > maxval) {
    maxval = offset_input[input_index];
    maxidx = input_index;
}

Likely cause

  • For __half / c10::Half, the availability of operator overloads depends on macros such as __CUDA_NO_HALF_OPERATORS__ and __CUDA_NO_HALF_CONVERSIONS__.
  • On MSVC+NVCC, the implicit promotion from half → float appears to be treated more strictly, resulting in ambiguity.
  • On GCC/Clang this may not trigger, which is why others may not have reported it.

I worked around the problem by casting both operands with static_cast<float>, but hard-coding float is not a good general solution: with T=double, for example, it would narrow the values before comparing them.

Better solution

Following PyTorch conventions, we could use an accumulation type to ensure consistent promotion:

using acc_t = at::acc_type<T, /*is_cuda=*/true>;
acc_t v  = static_cast<acc_t>(offset_input[input_index]);
acc_t mv = static_cast<acc_t>(maxval);

if (v > mv) {
    maxval = offset_input[input_index];
    maxidx = input_index;
}

And also initialize maxval in a type-safe way:

T maxval = is_empty ? T(0) : std::numeric_limits<T>::lowest();
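
To make the two changes above concrete, here is a minimal host-only C++ sketch of the same pattern. It is not the torchvision kernel: FakeHalf, AccType, PoolResult and max_pool_bin are hypothetical names made up for illustration, with FakeHalf standing in for a half type whose conversions are explicit and AccType standing in for at::acc_type. Because std::numeric_limits is not specialized for the stand-in type, the sketch takes lowest() from the accumulation type and casts it to T, which is an assumption of the sketch rather than part of the proposal.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for a half-like storage type. Conversions are
// explicit, so a bare `a > b` on two FakeHalf values would not compile.
struct FakeHalf {
  float value;  // stored as float purely for illustration
  explicit FakeHalf(float v) : value(v) {}
  explicit operator float() const { return value; }
};

// Hypothetical analogue of at::acc_type: maps a storage type to the type
// used for comparisons. Defaults to the type itself (float, double).
template <typename T>
struct AccType { using type = T; };
template <>
struct AccType<FakeHalf> { using type = float; };

template <typename T>
struct PoolResult {
  T maxval;
  int maxidx;
};

// Max over one pooling bin, comparing in the accumulation type.
template <typename T>
PoolResult<T> max_pool_bin(const std::vector<T>& input) {
  using acc_t = typename AccType<T>::type;
  const bool is_empty = input.empty();
  // lowest() is the most negative finite value, so all-negative bins work.
  T maxval = static_cast<T>(
      is_empty ? acc_t(0) : std::numeric_limits<acc_t>::lowest());
  int maxidx = -1;
  for (std::size_t i = 0; i < input.size(); ++i) {
    const acc_t v = static_cast<acc_t>(input[i]);
    const acc_t mv = static_cast<acc_t>(maxval);
    if (v > mv) {  // unambiguous: both operands are acc_t
      maxval = input[i];
      maxidx = static_cast<int>(i);
    }
  }
  return PoolResult<T>{maxval, maxidx};
}
```

Calling max_pool_bin on a std::vector<FakeHalf> compares in float, while a std::vector<double> is compared in double, so no precision is lost for wider types. Note that for floating-point types std::numeric_limits<T>::min() is the smallest positive normal value, not the most negative one, which is why lowest() is the appropriate initial value here.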

Suggestion

Would it make sense to update the ROI Pool kernel to use acc_type (and lowest()) to avoid this build-time ambiguity and preserve precision across float/double/half?

I can open a PR if maintainers think this is the right direction.

Please note that I’m still a junior-level software engineer, and this is my first time posting on the forum. I apologize in advance if my message is not fully aligned with the usual standards, and I appreciate your understanding.

Thank you for the detailed description! I think it would be great if you could also open an issue on GitHub describing your proposal, so the module maintainers can take a look and review it.

Thanks for your reply! I’ve opened an issue here: [CUDA/MSVC][Suggestion] ROI Pool half-precision build error due to ambiguous comparison · Issue #9246 · pytorch/vision · GitHub

I’m not entirely sure if I submitted it in the correct way, but I thought it would be better to share it here first.

Really appreciate your suggestion and guidance!