Is it possible with torchrun to start only a rendezvous server, with no local workers? I have a pool of potential GPU workers, all of which can come and go, and a separate, more reliable machine (without a GPU) that I’d like to use to run only the rendezvous server.
I tried running torchrun --nproc-per-node 0 to launch just the c10d rendezvous server on that machine, but this yielded an error.