Why would functional and non-functional broadcast use `src` with different semantics?

Hi, I am wondering why `src` means different things in `torch.distributed._functional_collectives.broadcast` and `torch.distributed.distributed_c10d.broadcast`. In the functional version, `src` is the local rank inside the group, while in the non-functional version, `src` is a global rank. This is misleading for users who start with the functional version.

Agreed that this is not ideal, but because `torch.distributed.distributed_c10d.broadcast` is the original collective, we can't change it for backwards-compatibility reasons. The functional version always requires a group to be passed in, so it can use the group-local rank.
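To make the difference concrete, here is a minimal pure-Python sketch of the two rank conventions. It does not call PyTorch: the helper names `group_rank_to_global` and `global_rank_to_group` and the example subgroup are hypothetical, and a process group is modeled simply as the ordered list of global ranks it contains.

```python
# Hypothetical helpers for illustration only; they are not part of torch.distributed.
# A process group is modeled as the ordered list of global ranks that belong to it.

def group_rank_to_global(group_ranks, group_rank):
    """Translate a rank local to the group into a global rank."""
    return group_ranks[group_rank]


def global_rank_to_group(group_ranks, global_rank):
    """Translate a global rank into its position inside the group."""
    return group_ranks.index(global_rank)


# Assumed example: a subgroup made of global ranks 4..7 in an 8-process job.
subgroup = [4, 5, 6, 7]

# The same source process, written in each convention:
src_group_local = 0  # what the thread says the functional broadcast expects
src_global = group_rank_to_global(subgroup, src_group_local)  # what the c10d broadcast expects

print(src_group_local, src_global)
```

In real code, I believe recent PyTorch versions provide `torch.distributed.get_global_rank(group, group_rank)` and `torch.distributed.get_group_rank(group, global_rank)` for this translation, so it should not be necessary to hand-roll the mapping.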
