Windows DDP on RTX 50-series only: use_libuv was requested but PyTorch was built without libuv support (works on 40/20-series)

Swati_sd · October 25, 2025, 3:07am

Goal

Run PatchCore (FAISS + PyTorch) with DDP on Windows using python -m torch.distributed.run --nproc_per_node=2.

Command

set USE_LIBUV=0 && set MASTER_ADDR=127.0.0.1 && set MASTER_PORT=29500 ^
&& python -m torch.distributed.run --nproc_per_node=2 train_tuple.py --ddp --use_gpu

(Also tried torchrun alias, same result.)
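One detail of the cmd one-liner: in cmd.exe, `set USE_LIBUV=0 && ...` stores everything up to the `&&`, so the value becomes `0` followed by a space. It is unconfirmed whether that affects how PyTorch reads the variable. To rule it out, the same launch can be driven from a small Python wrapper that sets each value as an exact string. This is a minimal sketch; `build_launch` and `launch` are hypothetical helper names, and the script name and flags are copied from the command above.

```python
import os
import subprocess
import sys


def build_launch(script="train_tuple.py", nproc=2, port=29500):
    """Return (argv, env) mirroring the cmd one-liner (hypothetical helper)."""
    env = dict(os.environ)
    # Values are exact strings, so no trailing whitespace can end up in them.
    env.update(
        USE_LIBUV="0",
        MASTER_ADDR="127.0.0.1",
        MASTER_PORT=str(port),
    )
    argv = [
        sys.executable, "-m", "torch.distributed.run",
        f"--nproc_per_node={nproc}",
        script, "--ddp", "--use_gpu",
    ]
    return argv, env


def launch():
    """Run the launcher as a child process and return its exit code."""
    argv, env = build_launch()
    return subprocess.run(argv, env=env).returncode
```

Calling `sys.exit(launch())` from a wrapper script reproduces the run with a known-clean environment, which at least separates shell quoting issues from the libuv error itself.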

Error (excerpt)

torch.distributed.DistStoreError: use_libuv was requested but PyTorch was built without libuv support, run with USE_LIBUV

System & environment

OS / GPU

Windows 10/11 (x64)
RTX 5090 (driver version not listed here; I can share the exact nvidia-smi output if needed)

Python

Python 3.11.14

PyTorch / CUDA / FAISS (conda + pip mix)

torch                        2.10.0.dev20251022+cu128  (pip)
torchvision                  0.25.0.dev20251022+cu128  (pip)
torchaudio                   2.10.0.dev20251022+cu128  (pip)
faiss                        1.9.0  py311cuda126...    (conda-forge)
faiss-gpu                    1.9.0  ...                (conda-forge)
cuda-toolkit                 12.8.0                    (conda, NVIDIA label)
cudnn                        9.13.1                    (conda-forge)
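Since the environment mixes conda and pip packages, it may help to report what the imported torch build itself says, rather than what the package list says. The sketch below collects that into one dictionary. `torch_build_report` is a hypothetical helper name; the torch calls it uses (`torch.version.cuda`, `torch.cuda.get_arch_list`, `torch.cuda.get_device_capability`, `torch.distributed.is_gloo_available`) are standard public API, and the function returns early if torch cannot be imported.

```python
import platform


def torch_build_report():
    """Collect build and device facts for a bug report (hypothetical helper)."""
    report = {"python": platform.python_version(), "os": platform.platform()}
    try:
        import torch
        import torch.distributed as dist
    except ImportError:
        # torch is not importable in this interpreter; report that and stop.
        report["torch"] = None
        return report

    report["torch"] = torch.__version__
    report["cuda_build"] = torch.version.cuda  # CUDA version the wheel was built against
    report["dist_available"] = dist.is_available()
    # is_gloo_available only exists when the distributed package is available.
    report["gloo_available"] = dist.is_available() and dist.is_gloo_available()
    report["devices"] = []
    if torch.cuda.is_available():
        report["arch_list"] = torch.cuda.get_arch_list()  # architectures compiled into the wheel
        for i in range(torch.cuda.device_count()):
            report["devices"].append(
                (torch.cuda.get_device_name(i), torch.cuda.get_device_capability(i))
            )
    return report
```

Running `print(torch_build_report())` on both the working 4090 machine and the failing 50-series machine would show whether the two are actually using the same torch build.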

The error message specifically complains about libuv rendezvous.

Things I already tried

Setting USE_LIBUV=0 in the shell and at the very top of the script (before any torch import).
Explicit backend="gloo" in init_process_group.
Using python -m torch.distributed.run vs torchrun.
Ensuring MASTER_ADDR/MASTER_PORT are set and free.
Verifying single-process execution works.
Verifying that os.environ["USE_LIBUV"] prints 0 inside the script right before importing torch.distributed.
Searching the forums and trying several nightly versions and the suggested torch/CUDA combinations; the error still remains.
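Two of the checks above can be made stricter. Printing an environment variable does not reveal a trailing space, and "the port is free" is easy to assume without testing. The following is a minimal sketch using only the standard library; `show_env` and `port_is_free` are hypothetical helper names, and the default address and port are the ones from the command in this post.

```python
import os
import socket


def show_env(names=("USE_LIBUV", "MASTER_ADDR", "MASTER_PORT")):
    """Return {name: repr(value)} so stray spaces or quotes become visible."""
    return {n: repr(os.environ.get(n)) for n in names}


def port_is_free(host="127.0.0.1", port=29500):
    """Return True if a TCP socket can bind host:port at this moment."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # Typically "address already in use": something else holds the port.
            return False
    return True
```

A value shown as `'0 '` instead of `'0'` would point to the shell command rather than to PyTorch. The port check only says the port was free at the time of the call, so it is a sanity check, not a guarantee.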

Questions for the community

Is libuv rendezvous expected/required on current PyTorch nightlies for Windows, or should USE_LIBUV=0 fully disable it?
What’s the recommended, known-good install matrix for the following, so that DDP (gloo backend) works with nproc_per_node>1?
- Windows + CUDA 12.6/12.8
- PyTorch (stable/nightly) + torchvision/torchaudio
- FAISS (GPU)
If libuv rendezvous is the intended default on Windows nightlies, how do I force TCPStore/gloo without libuv, or how do I install a libuv-enabled/non-libuv PyTorch build for Windows?
Any guidance on FAISS build compatibility with the above stack on Windows would also be appreciated (I’m fine to re-create a clean env).
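To make the "force TCPStore/gloo without libuv" question concrete, here is a minimal sketch of what that could look like inside the training script: build the store explicitly with `use_libuv=False` and hand it to `init_process_group`. `store_kwargs` and `init_gloo` are hypothetical helper names, and port 29501 is an assumption chosen so the store does not collide with the launcher's own MASTER_PORT. If the error is raised by the launcher before the script starts, this change alone would not help, and `rank` and `world_size` would have to come from another launcher such as `torch.multiprocessing.spawn`.

```python
from datetime import timedelta


def store_kwargs(rank, world_size, host="127.0.0.1", port=29501):
    """Keyword arguments for a TCPStore without the libuv backend (hypothetical helper)."""
    return dict(
        host_name=host,
        port=port,                  # assumed free and distinct from MASTER_PORT
        world_size=world_size,
        is_master=(rank == 0),      # rank 0 hosts the store, other ranks connect
        timeout=timedelta(seconds=60),
        use_libuv=False,            # request the non-libuv TCPStore backend
    )


def init_gloo(rank, world_size):
    """Initialise a gloo process group on an explicitly constructed store."""
    # Imported lazily so store_kwargs can be inspected without torch installed.
    import torch.distributed as dist

    store = dist.TCPStore(**store_kwargs(rank, world_size))
    dist.init_process_group(
        backend="gloo", store=store, rank=rank, world_size=world_size
    )
```

Passing `store=` bypasses the `env://` rendezvous inside the script, so whether this call succeeds would help narrow down where the libuv request comes from.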

What works vs. fails (using the exact same code base)

RTX 4090 — OK (DDP with nproc_per_node=2)
RTX 2080 Ti — OK
RTX 50-series — fails with the libuv rendezvous error