I lost some debugging time to group.allreduce(): I had not realized that it returns an async Work handle, unlike the module-level torch.distributed.* functions, which are synchronous by default (async_op=False).
To avoid similar mistakes, I searched for documentation of these methods but could not find any.
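For anyone hitting the same thing, here is a minimal sketch of the difference. It assumes a single-process "gloo" group on CPU so that it runs without a launcher; the helper name `demo_allreduce` and the temporary rendezvous file are only for illustration and are not part of any PyTorch API.

```python
import os
import shutil
import tempfile

import torch
import torch.distributed as dist


def demo_allreduce():
    """Contrast dist.all_reduce (blocking by default) with ProcessGroup.allreduce."""
    # Hypothetical setup: one rank, gloo backend, file-based rendezvous.
    tmp_dir = tempfile.mkdtemp()
    init_file = os.path.join(tmp_dir, "rendezvous")
    dist.init_process_group(
        backend="gloo",
        init_method=f"file://{init_file}",
        rank=0,
        world_size=1,
    )
    try:
        group = dist.group.WORLD  # the default ProcessGroup object
        t = torch.ones(4)

        # Module-level function: async_op defaults to False, so the call
        # waits for completion and returns None.
        ret = dist.all_reduce(t, group=group)

        # ProcessGroup method: returns a Work handle; the result should only
        # be relied on after wait() has been called.
        work = group.allreduce([t])
        work.wait()

        return ret, work, t
    finally:
        dist.destroy_process_group()
        shutil.rmtree(tmp_dir, ignore_errors=True)


if __name__ == "__main__":
    ret, work, t = demo_allreduce()
    print("dist.all_reduce returned:", ret)
    print("group.allreduce returned:", type(work).__name__)
    print("tensor after both reductions:", t)
```

With a single rank the sum leaves the tensor unchanged, so the point is only the return values: the module-level call hands back nothing, while the group method hands back a handle that has to be waited on. Passing async_op=True to dist.all_reduce gives the same kind of handle.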
If you click on ‘[Source]’, the source code with all methods of DeviceMesh will unfold. Alternatively, have a look at the DeviceMesh class on GitHub.
The functions for ProcessGroup are under the ‘Collective functions’ section. They are also declared in the .pyi interface file; see the ProcessGroup class.
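If the rendered docs are hard to find, the methods can also be listed straight from an installed PyTorch. This is only a rough sketch: the helper name `public_methods` is made up for illustration, and the exact set of names will differ between PyTorch versions.

```python
import torch.distributed as dist


def public_methods(cls):
    """Return the names of public callables defined on a class."""
    return sorted(
        name
        for name in dir(cls)
        if not name.startswith("_") and callable(getattr(cls, name, None))
    )


if __name__ == "__main__":
    # ProcessGroup is a C++ binding, so help() shows the bound signatures.
    for name in public_methods(dist.ProcessGroup):
        print(name)
    help(dist.ProcessGroup.allreduce)
```

The help() output for allreduce shows the return type as Work, which is the detail that is easy to miss.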