How does PyTorch copy data from one GPU to another when the server board doesn't support P2P?

Hi all!

I’m wondering how PyTorch copies data from one GPU to another when the server board doesn’t support P2P.

It is clear that we can copy a Tensor by using the “.to()” function.
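
For concreteness, this is the kind of copy I mean. It is only a minimal sketch, assuming a machine with at least two CUDA devices; the helper name copy_between_devices is mine, just for illustration, and the code falls back to the CPU so it still runs on a machine without two GPUs.

```python
import torch


def copy_between_devices(x: torch.Tensor, dst: torch.device) -> torch.Tensor:
    """Copy a tensor to another device with Tensor.to() (hypothetical helper)."""
    return x.to(dst)


# Use two GPUs when present; otherwise fall back to the CPU so the sketch still runs.
if torch.cuda.device_count() >= 2:
    src, dst = torch.device("cuda:0"), torch.device("cuda:1")
else:
    src, dst = torch.device("cpu"), torch.device("cpu")

a = torch.arange(6, dtype=torch.float32, device=src)
b = copy_between_devices(a, dst)

# Compare on the host so the check works for any device pair.
print(b.device, torch.equal(a.cpu(), b.cpu()))
```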

But I want to understand the process in detail.

What I’ve found is that PyTorch uses the ‘cudaMemcpy’ or ‘cudaMemcpyAsync’ function at a lower level.

However, as far as I understand, if the server board doesn’t support P2P, then ‘cudaMemcpy’ and ‘cudaMemcpyAsync’ cannot be used to copy data directly to another GPU.
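
To check whether a given board is actually in this situation, something like the following sketch could be used. It assumes torch.cuda.can_device_access_peer is available in the installed PyTorch version, and the helper names are mine. The second helper stages the copy through host memory by hand, which is only my guess at what a non-P2P path would have to do, not something I have confirmed in the PyTorch source.

```python
import torch


def peer_access_supported(src: int, dst: int) -> bool:
    """Report whether GPU `src` can directly access GPU `dst` (hypothetical helper)."""
    if torch.cuda.device_count() <= max(src, dst):
        return False  # the requested devices are not present on this machine
    return torch.cuda.can_device_access_peer(src, dst)


def copy_via_host(x: torch.Tensor, dst: torch.device) -> torch.Tensor:
    """Copy by explicitly staging through host memory: device -> CPU -> device."""
    staged = x.cpu()       # device-to-host copy (no-op if x is already on the CPU)
    return staged.to(dst)  # host-to-device copy


# Use two GPUs when present; otherwise fall back to the CPU so the sketch still runs.
if torch.cuda.device_count() >= 2:
    src_dev, dst_dev = torch.device("cuda:0"), torch.device("cuda:1")
else:
    src_dev, dst_dev = torch.device("cpu"), torch.device("cpu")

x = torch.ones(4, device=src_dev)
y = copy_via_host(x, dst_dev)

print("P2P 0 -> 1:", peer_access_supported(0, 1))
print("copy matches:", torch.equal(x.cpu(), y.cpu()))
```

If the first line prints False on a two-GPU machine and “.to()” still works, then the copy must be taking some path other than direct P2P, which is the part I would like explained.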

Without UVA (Unified Virtual Addressing), it is not possible to copy data from one GPU to another by using ‘cudaMemcpy’ and ‘cudaMemcpyAsync’.

What really confused me was this PyTorch forum link saying that PyTorch does not support UVA!

I would really appreciate it if you could tell me how PyTorch’s ‘to()’ function works at the C level on a non-P2P board.

Thank you.