NCCL in PyTorch

Are there any examples of using NCCL or CUDA with PyTorch? I want to use NCCL to synchronize data across multiple GPUs.
