I need to test a system that has multiple GPUs, linked with NVLink. What is the minimal code I could run to show NVLink actually works?
See this simple code example - how would you change it to take advantage of NVLink?
https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html
DistributedDataParallel via NCCL would use NVLink, if available.
For a quick performance test, I would recommend running the nccl-tests and also verifying the connections between the GPUs via nvidia-smi topo -m.
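As a rough illustration of the topology check suggested above, the following Python sketch parses the connection matrix printed by nvidia-smi topo -m and lists the GPU pairs whose entry is of the form NV# (a bonded set of # NVLinks). The function name nvlink_pairs and the SAMPLE text are hypothetical and exist only for this example; the sample is a made-up two-GPU matrix, not output from a real system, and the parser assumes the usual layout of a header row of GPU labels followed by one row per GPU.

```python
import re
import shutil
import subprocess

ANSI = re.compile(r"\x1b\[[0-9;]*m")   # some nvidia-smi versions underline the header
GPU = re.compile(r"GPU\d+$")
NVLINK = re.compile(r"NV(\d+)$")

# Hypothetical sample matrix, used only when nvidia-smi is not installed.
SAMPLE = """\
        GPU0    GPU1    CPU Affinity    NUMA Affinity
GPU0     X      NV2     0-15            0
GPU1    NV2      X      0-15            0
"""


def nvlink_pairs(topo_text: str) -> dict[tuple[int, int], int]:
    """Map (gpu_a, gpu_b) index pairs to the number of bonded NVLinks."""
    header = None
    pairs: dict[tuple[int, int], int] = {}
    for raw in topo_text.splitlines():
        tokens = ANSI.sub("", raw).split()
        if not tokens or not GPU.match(tokens[0]):
            continue  # skip blank lines, the legend, and NIC rows
        if header is None:
            header = tokens  # first GPU line is the column header
            continue
        row_gpu, cells = tokens[0], tokens[1:]
        for col_gpu, cell in zip(header, cells):
            link = NVLINK.match(cell)
            if GPU.match(col_gpu) and link:
                a, b = sorted((int(row_gpu[3:]), int(col_gpu[3:])))
                pairs[(a, b)] = int(link.group(1))
    return pairs


if __name__ == "__main__":
    if shutil.which("nvidia-smi"):
        text = subprocess.run(
            ["nvidia-smi", "topo", "-m"],
            capture_output=True, text=True, check=True,
        ).stdout
    else:
        text = SAMPLE  # fall back to the illustrative sample
    found = nvlink_pairs(text)
    if not found:
        print("No NVLink connections reported between GPUs.")
    for (a, b), links in sorted(found.items()):
        print(f"GPU{a} <-> GPU{b}: {links} NVLink(s)")
```

An empty result would suggest the GPUs only see each other over PCIe or the system interconnect, in which case NCCL cannot use NVLink either.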
I have to actually demo PyTorch, so I’ll see if I can get DistributedDataParallel to work.
nccl-tests looks like a useful hardware/config check, so I’ll give it a try as well.
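For the DistributedDataParallel demo mentioned above, a minimal sketch might look like the following. It is not taken from the quickstart tutorial: the tiny nn.Linear model, the random input batches, the helper names pick_backend and worker, and the port 29500 are all assumptions made for illustration. It starts one process per GPU and uses the NCCL backend, which is the path that can use NVLink; the gloo fallback is there only so the script still runs on a machine without CUDA and says nothing about NVLink.

```python
import os

import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch import nn
from torch.nn.parallel import DistributedDataParallel as DDP


def pick_backend() -> str:
    # NCCL requires CUDA devices; gloo is a CPU-only fallback.
    return "nccl" if torch.cuda.is_available() else "gloo"


def worker(rank: int, world_size: int) -> None:
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")  # assumed to be a free port
    backend = pick_backend()
    dist.init_process_group(backend, rank=rank, world_size=world_size)

    if backend == "nccl":
        torch.cuda.set_device(rank)
        device = torch.device("cuda", rank)
    else:
        device = torch.device("cpu")

    # Placeholder model and data, standing in for a real training setup.
    model = nn.Linear(1024, 1024).to(device)
    ddp_model = DDP(model, device_ids=[rank] if backend == "nccl" else None)
    optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.01)

    for _ in range(10):
        optimizer.zero_grad()
        loss = ddp_model(torch.randn(64, 1024, device=device)).sum()
        loss.backward()  # DDP all-reduces the gradients across ranks here
        optimizer.step()

    if rank == 0:
        print(f"finished 10 DDP steps on {world_size} process(es) using {backend}")
    dist.destroy_process_group()


if __name__ == "__main__":
    # One process per GPU; two CPU processes if CUDA is unavailable.
    world_size = torch.cuda.device_count() if torch.cuda.is_available() else 2
    mp.spawn(worker, args=(world_size,), nprocs=world_size)
```

Running the script with the environment variable NCCL_DEBUG=INFO set makes NCCL log details of its setup, which should help confirm how the GPUs are communicating. Comparing a run with NCCL_P2P_DISABLE=1 against a normal run is another rough way to see whether the direct GPU-to-GPU path makes a difference, although a model this small may not show much.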