https://pytorch.org/docs/stable/distributed.html#which-backend-to-use

The docs say: "If you encounter any problem with NCCL, use Gloo as the fallback option."
What mechanism does PyTorch provide to fall back from NCCL to Gloo?

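You would need to specify it yourself during distributed initialization, by passing the backend you want as the `backend` argument of `torch.distributed.init_process_group`.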
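A minimal sketch of what that could look like, assuming the script is launched with something like `torchrun` so that the usual `env://` rendezvous variables (`MASTER_ADDR`, `MASTER_PORT`, `RANK`, `WORLD_SIZE`) are already set. The helper names `pick_backend` and `init_distributed` are made up for illustration and are not part of PyTorch; only `torch.cuda.is_available`, `torch.distributed.is_nccl_available`, and `torch.distributed.init_process_group` are real calls.

```python
def pick_backend(cuda_available: bool, nccl_available: bool, force_gloo: bool = False) -> str:
    """Hypothetical helper: return "nccl" only when it looks usable, else "gloo"."""
    if force_gloo:
        return "gloo"
    if cuda_available and nccl_available:
        return "nccl"
    return "gloo"


def init_distributed(force_gloo: bool = False) -> str:
    """Hypothetical helper: initialize the default process group with the chosen backend."""
    # Imported lazily so pick_backend can be used without PyTorch installed.
    import torch
    import torch.distributed as dist

    backend = pick_backend(
        cuda_available=torch.cuda.is_available(),
        nccl_available=dist.is_nccl_available(),
        force_gloo=force_gloo,
    )
    # Uses the default env:// init method, so the rendezvous variables must be set.
    dist.init_process_group(backend=backend)
    return backend
```

This only chooses the backend before initialization. Every rank has to end up with the same backend, and NCCL problems may only show up at the first collective call rather than in `init_process_group`, so if NCCL misbehaves at runtime the simplest route is probably to relaunch with `force_gloo=True` (or hard-code `backend="gloo"`).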