NVLS support in PyTorch

Does PyTorch support NVLS? If not, how does it manage to call NCCL’s NVLS algorithm using `torch.distributed.all_reduce`?


I think it is not supported as I see from:

It’s possible to use NVLS via the `torch.cuda.MemPool` API, which landed in this PR. We are also working on enabling it in other places, e.g. DDP (related to this PR).
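For reference, here is a rough sketch of what the MemPool route can look like. It assumes a recent PyTorch build in which the NCCL backend exposes `mem_allocator`, `register_mem_pool` and `deregister_mem_pool`, and it relies on the private helpers `_get_default_group` and `_get_backend`, which may change between releases. The function name `all_reduce_with_nccl_mem_pool` is made up for illustration. Whether NCCL actually selects an NVLS algorithm still depends on the hardware, the NCCL version and its settings, so treat this as a starting point rather than a guaranteed recipe.

```python
import os


def all_reduce_with_nccl_mem_pool():
    """Hypothetical helper: all-reduce a tensor that lives in an NCCL-backed pool."""
    # Imported here so the file can be loaded on a machine without PyTorch.
    import torch
    import torch.distributed as dist
    from torch.distributed.distributed_c10d import _get_default_group

    # Assumes launch via torchrun, which sets LOCAL_RANK for each process.
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    device = torch.device("cuda", local_rank)
    dist.init_process_group(backend="nccl")

    # Private API (assumption: may change): the NCCL backend object
    # behind the default process group.
    backend = _get_default_group()._get_backend(device)

    # Build a memory pool on top of the allocator provided by the backend,
    # so tensors created under it are allocated by NCCL rather than by the
    # regular caching allocator.
    pool = torch.cuda.MemPool(backend.mem_allocator)
    with torch.cuda.use_mem_pool(pool):
        tensor = torch.ones(1024 * 1024, device=device)

    # Register the pool's buffers with NCCL before running the collective.
    backend.register_mem_pool(pool)
    dist.all_reduce(tensor)
    torch.cuda.synchronize()
    backend.deregister_mem_pool(pool)

    # Every rank contributed ones, so each element should equal the world size.
    result = tensor[0].item()

    # Release the tensor and the pool before tearing down the process group.
    del tensor, pool
    dist.destroy_process_group()
    return result


# Only run when started by torchrun (which sets LOCAL_RANK).
if __name__ == "__main__" and "LOCAL_RANK" in os.environ:
    print(all_reduce_with_nccl_mem_pool())
```

Something like `torchrun --nproc-per-node=8 script.py` would launch it (the script name and process count are placeholders). Running with `NCCL_DEBUG=INFO` should let you check in the NCCL log which algorithm was chosen.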