Strange OOM when compiling aten/src/ATen/native/cuda/Sort.cu

When I compile caffe2 from the PyTorch source, the build dies consistently at aten/src/ATen/native/cuda/Sort.cu.
Error message/problem:

LLVM Error: Out of memory
nvcc error: 'cicc' died due to signal 6

But the system RAM (28 GiB) was never filled, and there is plenty of swap (64 GiB).
What I have tried:
Unsetting the architecture: no change
Compiling that single file on its own: no change
Adding -G and compiling that single file: success, but slower
Adding -G -dopt=on and compiling that single file: same failure as the first two
Question:
How can I pass -G only when aten/src/ATen/native/cuda/Sort.cu is the file being compiled? Or, even better, is there a fix that doesn’t require disabling all optimizations, as -G alone does?
System:
Gentoo Linux
Build tool:
CMake (hooked by Portage in non-single-file compiles)
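
On the narrower question of passing -G for a single file, here is a minimal sketch. It assumes the build compiles .cu files through CMake's native CUDA language support (not the older FindCUDA macros) and uses CMake 3.11 or newer, where the COMPILE_OPTIONS source-file property is available. The placement and the ${CMAKE_SOURCE_DIR} prefix are assumptions, not taken from the PyTorch build files. By default, source-file properties are only visible to targets created in the same directory, so the call would have to sit in the CMakeLists.txt that adds the target containing Sort.cu.

```cmake
# Sketch only: placement and path prefix are assumptions, not from PyTorch's build files.
# Needs CMake >= 3.11 (COMPILE_OPTIONS source-file property) and native CUDA language support.
# Must run in the same CMakeLists.txt that creates the target containing Sort.cu,
# because source-file properties are directory-scoped by default.
set_source_files_properties(
  "${CMAKE_SOURCE_DIR}/aten/src/ATen/native/cuda/Sort.cu"
  PROPERTIES
    COMPILE_OPTIONS "-G"  # nvcc device-debug flag; turns off device-code optimization
)
```

If this applies, only Sort.cu would be built with -G and every other file would keep its normal flags. It works around the crash but does not explain it.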

Which GPU and CUDA version are you using?
I solved this problem by switching the CUDA version from 12.0 to 11.8.

I didn’t, and thanks for the advice