NCCL error in torch._C._dist_broadcast(tensor, src, group) when training on two nodes

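Set the NCCL_SOCKET_IFNAME environment variable to tell NCCL which network interface to use. On multi-node runs NCCL may otherwise pick an interface that the other node cannot reach.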
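
A minimal sketch of that suggestion, assuming the nodes talk to each other over an interface named "eth0" (a placeholder; check yours with `ip addr` or `ifconfig`). The helper name `configure_nccl_interface` is made up for illustration. The variable has to be set on every node before the process group is created.

```python
import os


def configure_nccl_interface(ifname: str) -> None:
    """Point NCCL at a specific network interface (hypothetical helper)."""
    # NCCL reads this variable when it initializes, so set it
    # before calling torch.distributed.init_process_group.
    os.environ["NCCL_SOCKET_IFNAME"] = ifname


# "eth0" is an assumed name; replace it with the interface that
# actually connects your two nodes.
configure_nccl_interface("eth0")

# Then initialize as usual, for example:
# import torch.distributed as dist
# dist.init_process_group(backend="nccl", init_method="env://")
```

You can also export the variable in the shell that launches training instead of setting it in Python. If the error persists, setting NCCL_DEBUG=INFO makes NCCL log which interface it selected.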
