Help compiling PyTorch 2.0.1 with CUDA 11.4

I am trying to compile PyTorch 2.0.1 with CUDA 11.4 on a Xavier NX, using:


    export DEBUG=1
    export BUILD_TEST=0
    export USE_CUDA=ON
    export ENABLE_CUDA=ON
    export CUDACXX=/usr/local/cuda/bin
    export CMAKE_CUDA_FLAGS=-std=c++17
    export CUDNN_INCLUDE_DIR=/usr/include/aarch64-linux-gnu/
    export CUDNN_LIB_DIR=/usr/lib
    export USE_DISTRIBUTED=0
    export USE_MKLDNN=0
    export USE_FBGEMM=0
    export USE_NNPACK=0
    export USE_QNNPACK=0
    export USE_XNNPACK=0
    export BUILD_CAFFE2_OPS=0
    export PATH=/home/nvidia/.venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/local/cuda/bin
    python setup.py build --cmake-only
    cmake --build build

but in the resulting CMakeCache.txt I see

    //Use CUDA

and therefore PyTorch compiles without CUDA. I get the same result with

    export USE_CUDA=1
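
For reference, this is roughly how the cached value can be read back after the configure step, as a minimal sketch. The helper name `cache_value` is made up for illustration, and the path in the usage comment assumes the default `build` directory. It relies on CMake storing each cache entry as `NAME:TYPE=VALUE`, with the `//...` help text on the line above it.

```shell
#!/usr/bin/env bash
# cache_value FILE NAME
# Print the value of a CMake cache entry stored as NAME:TYPE=VALUE.
# Hypothetical helper for illustration; not part of PyTorch or CMake.
cache_value() {
    local file="$1" name="$2"
    # Match only the entry line itself, not the "//..." help comment above it.
    grep -E "^${name}:[A-Z]+=" "$file" | head -n 1 | cut -d= -f2-
}

# Usage (assumes the default build directory):
#   cache_value build/CMakeCache.txt USE_CUDA
```

An empty result means the entry is not in the cache at all, which is a different situation from the entry being present but switched off.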

Any pointers appreciated. Am I missing something obvious here?
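
In case it helps narrow things down, here is a small sketch for checking what each path variable actually points at. As far as I understand, CMake reads `CUDACXX` as the CUDA compiler executable itself, so the assumption here is that it should resolve to a binary such as `nvcc` rather than to a directory. The function name `path_kind` is only for illustration.

```shell
#!/usr/bin/env bash
# path_kind PATH
# Classify a path as "directory", "executable", "file" (exists but is not
# an executable regular file) or "missing".
path_kind() {
    local p="$1"
    if [ -d "$p" ]; then
        echo "directory"
    elif [ -f "$p" ] && [ -x "$p" ]; then
        echo "executable"
    elif [ -e "$p" ]; then
        echo "file"
    else
        echo "missing"
    fi
}

# Usage: report what the CUDA compiler variable currently points at.
#   path_kind "${CUDACXX:-}"
```

The same check can be run against `CUDNN_INCLUDE_DIR` and `CUDNN_LIB_DIR`, where "directory" would be the expected answer.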