I ran the DDP code and it got stuck. I opened this issue:
Solve stuck issue by setting NCCL_P2P_LEVEL=NVL · Issue #1 · rickyang1114/DDP-practice (github.com)
I found the workaround in this issue:
DDP gets stuck on A40 GPUs · Issue #73206 · pytorch/pytorch (github.com)
My question is: why does setting NCCL_P2P_LEVEL=NVL fix the hang? Is this a bug that needs to be fixed?
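For reference, here is a minimal sketch of how I applied the workaround from Python. The helper name `configure_nccl_p2p` is my own and not part of PyTorch or NCCL. As far as I understand the NCCL documentation, `NVL` limits peer-to-peer transfers to GPUs connected through NVLink, so other GPU pairs fall back to a non-P2P path.

```python
import os


def configure_nccl_p2p(level: str = "NVL") -> dict:
    """Set NCCL_P2P_LEVEL for this process (hypothetical helper).

    NCCL reads the variable when it sets up its communicators, so this
    must run before the process group is initialized.
    """
    os.environ["NCCL_P2P_LEVEL"] = level
    return {"NCCL_P2P_LEVEL": os.environ["NCCL_P2P_LEVEL"]}


applied = configure_nccl_p2p("NVL")
print(applied)
```

I call this at the top of each worker, before `torch.distributed.init_process_group(backend="nccl")`. Exporting the variable in the shell before launching should have the same effect, and setting `NCCL_DEBUG=INFO` as well makes NCCL log which transport it picks.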