How do multiple GPUs interconnect when using DataParallel to train a model on multiple GPUs?

Hi all,

I am trying to train a model using multiple GPUs on a single machine, and I want to figure out how the GPUs connect with each other (I mean the topology of the connection: fully connected? in parallel? in series?).


If you want to check how the devices are connected inside your machine, you can run nvidia-smi topo -m in your terminal.
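A small sketch of running that command from Python, in case you want to inspect the topology programmatically. The helper name is hypothetical; the nvidia-smi topo -m output itself is a matrix showing, for each GPU pair, the link type (e.g. NV# for NVLink, PIX/PXB for PCIe, SYS for a path crossing the inter-socket interconnect):

```python
import shutil
import subprocess

def print_gpu_topology():
    """Hypothetical helper: print the GPU interconnect topology matrix.

    Falls back gracefully on machines without an NVIDIA driver.
    """
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found; no NVIDIA driver on this machine")
        return
    result = subprocess.run(
        ["nvidia-smi", "topo", "-m"],
        capture_output=True, text=True,
    )
    print(result.stdout)

print_gpu_topology()
```

The legend printed below the matrix explains each abbreviation, so you can tell whether two GPUs talk over NVLink directly or have to go through the PCIe hierarchy (and possibly the CPU).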

I might have misunderstood the question, but nn.DataParallel replicates the model on each device, as explained here.

@ptrblck thanks for your reply! I used the DataParallel module, and after checking the code I found that GPU0 gathers the results from the other GPUs (e.g. GPU1 and GPU2), performs some computation itself, and then scatters back to the others. So I want to understand the network between them (the way they communicate), and also how the GPUs communicate with the CPU in this DataParallel approach. I tried to dive into the C++ code, but I am not sure where to look.
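For reference, the scatter/gather flow described above can be sketched with the real helpers that nn.DataParallel uses internally (torch.nn.parallel.scatter, replicate, parallel_apply, gather); the wrapper function below is my own illustration, not the actual implementation:

```python
import torch
import torch.nn as nn

def data_parallel_forward(module, batch, device_ids):
    """Illustrative sketch of one nn.DataParallel forward pass:
    scatter inputs -> replicate model -> run in parallel -> gather on GPU 0.
    """
    from torch.nn.parallel import replicate, parallel_apply
    from torch.nn.parallel.scatter_gather import scatter, gather

    inputs = scatter(batch, device_ids)        # split the batch across GPUs
    replicas = replicate(module, device_ids)   # broadcast model from GPU 0
    outputs = parallel_apply(replicas, inputs) # forward on each replica
    return gather(outputs, device_ids[0])      # collect results on GPU 0

# Only meaningful on a machine with at least two GPUs.
if torch.cuda.device_count() >= 2:
    model = nn.Linear(8, 4).cuda(0)
    x = torch.randn(16, 8)
    y = data_parallel_forward(model, x, [0, 1])
    print(y.shape)
```

The inter-GPU copies in replicate/scatter/gather go over whatever link the topology matrix shows (NVLink or PCIe, possibly via the host), which is why nvidia-smi topo -m is the right tool for the original question.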


You can find the implementation in

That helps. Thanks for your help!