How to broadcast tensors using NCCL?

@vgoklani @Li_Shen

I answered this in another post, *Deadlock when using torch.distributed.broadcast*.

Basically, when you add collectives like broadcast, please make sure they are called on all ranks rather than only on rank 0 (just like all_reduce in your script); that should resolve the issue. Let me know if it doesn't work.
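
For reference, here's a minimal sketch (not your script) of broadcasting a tensor over NCCL with every rank participating. It assumes the script is launched with torchrun, so `RANK`, `WORLD_SIZE`, and `LOCAL_RANK` are set in the environment; the tensor shape and values are just placeholders:

```python
# Minimal sketch: every rank must call broadcast; only the source rank's
# tensor contents are sent, the other ranks receive into their buffer.
import os
import torch
import torch.distributed as dist

def main():
    # Assumes launch via torchrun, which sets RANK/WORLD_SIZE/LOCAL_RANK.
    dist.init_process_group(backend="nccl")
    rank = dist.get_rank()
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Rank 0 holds the real data; other ranks allocate a same-shaped buffer.
    if rank == 0:
        tensor = torch.arange(4, dtype=torch.float32, device="cuda")
    else:
        tensor = torch.empty(4, dtype=torch.float32, device="cuda")

    # Every rank calls broadcast -- calling it only on rank 0 deadlocks,
    # because the other ranks never join the collective.
    dist.broadcast(tensor, src=0)
    print(f"rank {rank}: {tensor.tolist()}")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

You can run it with something like `torchrun --nproc_per_node=2 broadcast_example.py` (the filename is just an example).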