How do PyTorch dist ranks map to NCCL ranks?

When doing an allreduce with PyTorch distributed, NCCL is used under the hood. If NCCL uses its ring algorithm, is the order of the ring defined by the order of the ranks set in `torch.distributed.init_process_group(rank=rank)`?

The ring order is not controlled from PyTorch directly, and NCCL doesn't provide control over this ordering via environment variables either (NCCL builds its rings from the detected hardware topology), but you can take a look at Environment Variables — NCCL 2.17.1 documentation and see if anything resonates.
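One practical option from those docs: you can inspect the ring order NCCL actually chose by turning on its debug logging before the communicator is created. A minimal sketch (the `NCCL_DEBUG` and `NCCL_DEBUG_SUBSYS` variables are documented NCCL settings; the commented-out distributed part is illustrative and assumes a multi-GPU setup with `rank` and `world_size` already defined):

```python
import os

# These must be set before the NCCL communicator is created,
# i.e. before init_process_group(backend="nccl") / the first collective.
os.environ["NCCL_DEBUG"] = "INFO"          # log communicator setup details
os.environ["NCCL_DEBUG_SUBSYS"] = "GRAPH"  # focus logs on topology/ring construction

# Sketch of the actual run (requires GPUs and NCCL, so left commented here):
# import torch
# import torch.distributed as dist
# dist.init_process_group(backend="nccl", rank=rank, world_size=world_size)
# t = torch.ones(1, device=f"cuda:{rank}")
# dist.all_reduce(t)  # NCCL prints the ring/channel layout during setup
```

The log output during communicator setup shows which ranks NCCL wired together, which you can compare against the ranks you passed to `init_process_group`.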
