DDP training on RTX 4090 (ADA, cu118)

NVIDIA just updated that thread (Standard nVidia CUDA tests fail with dual RTX 4090 Linux box - #16 by abchauhan - Linux - NVIDIA Developer Forums):

… Feedback from Engineering is that Peer to Peer is not supported on 4090. The applications/driver should not report this configuration as peer to peer capable. The reporting is being fixed and future drivers will report the following instead… (in the simplep2p test)…

Peer to Peer access is not available amongst GPUs in the system, waiving test.

II. ./streamOrderedAllocationIPC
Device 1 is not peer capable with some other selected peers, skipping

That's a bummer for the many people who bought multiple 4090 GPUs for ML.
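For anyone who wants to check their own box, here is a rough sketch, not an official recipe. It asks PyTorch which GPU pairs report peer access via `torch.cuda.can_device_access_peer`, and builds an environment with `NCCL_P2P_DISABLE=1` so NCCL does not attempt peer-to-peer transfers. The helper names `p2p_matrix` and `nccl_env_without_p2p` are made up for illustration. Note that, per the quote above, current drivers may still report the 4090 as peer capable, so a `True` here is not proof that P2P actually works.

```python
import os


def p2p_matrix():
    """Return {(i, j): bool} for every ordered GPU pair.

    Returns None if PyTorch is not installed or CUDA is unavailable.
    Hypothetical helper; the values are only what the driver reports.
    """
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    n = torch.cuda.device_count()
    return {
        (i, j): torch.cuda.can_device_access_peer(i, j)
        for i in range(n)
        for j in range(n)
        if i != j
    }


def nccl_env_without_p2p(base=None):
    """Copy an environment mapping and set NCCL_P2P_DISABLE=1 in the copy.

    Hypothetical helper; pass the result as env= when launching training.
    """
    env = dict(os.environ if base is None else base)
    env["NCCL_P2P_DISABLE"] = "1"
    return env


if __name__ == "__main__":
    matrix = p2p_matrix()
    if matrix is None:
        print("PyTorch with CUDA not available, cannot query peer access.")
    else:
        for (i, j), ok in sorted(matrix.items()):
            print(f"GPU {i} -> GPU {j}: peer access reported = {ok}")
```

The environment from `nccl_env_without_p2p()` can be passed as `env=` to `subprocess.run` when launching something like `torchrun --nproc_per_node=2 train.py` (where `train.py` stands in for your own script). With P2P disabled NCCL has to use a slower transport, so expect some loss of inter-GPU bandwidth.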
