Gradients of tensor received from another GPU

Hi,

I was planning to process some data on a GPU (let me call it GPU1) and then send a tensor from GPU1 to another GPU (GPU2) using torch.distributed.send and torch.distributed.recv.

I was wondering, does the received tensor on GPU2 keep the gradient history from its previous life on GPU1? Is it possible to apply backpropagation?

Thanks in advance for your help

Your use case sounds like model parallelism, so I'm not sure you really need send/recv; you might be able to use a simple model-parallel setup instead, roughly like the sketch below.
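
For reference, a minimal single-process model-parallel sketch (not your exact use case, just to show what I mean): the `.to()` transfer between devices is itself differentiable, so autograd carries gradients across GPUs without any explicit send/recv. The module and device names are only illustrative and assume two visible GPUs.

```python
import torch
import torch.nn as nn

# Minimal model-parallel sketch: two stages on two GPUs in one process.
# The .to("cuda:1") move of the activation is recorded by autograd, so
# backward() flows across devices without explicit send/recv.
class TwoGPUModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Linear(16, 32).to("cuda:0")
        self.stage2 = nn.Linear(32, 8).to("cuda:1")

    def forward(self, x):
        h = torch.relu(self.stage1(x.to("cuda:0")))
        return self.stage2(h.to("cuda:1"))

model = TwoGPUModel()
out = model(torch.randn(4, 16))
out.sum().backward()  # parameters on both GPUs receive gradients
```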

Thanks for your reply. Sorry for the incomplete explanation of the problem. What I want is not quite the same as your example. I want to code a parallel solver, so each GPU solves part of the problem in parallel, and after some operations some of them have to exchange their tensors in pairs. I was wondering whether the gradients of the tensors being transferred are lost, or whether they can still be backpropagated.

There is no backpropagation for send and recv. You can use the RPC framework (Distributed RPC Framework — PyTorch 1.9.0 documentation), which will allow you to backpropagate across RPC calls.
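
To make that concrete, here is a minimal sketch along the lines of the distributed autograd docs. It assumes two processes that have already called rpc.init_rpc; the names "worker0"/"worker1" and the remote_square helper are just placeholders:

```python
import torch
import torch.distributed.autograd as dist_autograd
import torch.distributed.rpc as rpc

# Placeholder function executed on the remote worker; it must be importable
# on both processes.
def remote_square(x):
    return (x * x).sum()

def run_worker0():
    # Assumes rpc.init_rpc("worker0", rank=0, world_size=2) was called here
    # and rpc.init_rpc("worker1", rank=1, world_size=2) on the peer.
    t = torch.randn(3, 3, requires_grad=True)
    with dist_autograd.context() as context_id:
        loss = rpc.rpc_sync("worker1", remote_square, args=(t,))
        # The distributed backward pass runs across the RPC boundary.
        dist_autograd.backward(context_id, [loss])
        # Gradients are stored per autograd context, not in t.grad.
        grads = dist_autograd.get_gradients(context_id)
        print(grads[t])
```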

Alternatively, you could implement this yourself via autograd functions, e.g. pytorch/functional.py at master · pytorch/pytorch · GitHub. You can find the docs for autograd functions here: Automatic differentiation package - torch.autograd — PyTorch 1.9.0 documentation
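
If you go the custom-autograd-function route, the rough idea is to pair the forward transfer with the opposite transfer in backward. The sketch below is untested and not an existing PyTorch API: SendToPeer, RecvFromPeer, and the run_* helpers are made-up names, a process group is assumed to have been initialized with dist.init_process_group (CPU tensors with the gloo backend here; with NCCL you would use CUDA tensors on each rank's device):

```python
import torch
import torch.distributed as dist
from torch.autograd import Function

class SendToPeer(Function):
    """Send a tensor to `dst`; its gradient comes back in backward()."""
    @staticmethod
    def forward(ctx, tensor, dst):
        ctx.dst = dst
        ctx.meta = (tensor.shape, tensor.dtype, tensor.device)
        dist.send(tensor.contiguous(), dst)
        # Scalar handle: calling handle.backward() later pulls the gradient
        # from the receiver and continues backprop through `tensor`'s graph.
        return tensor.new_zeros(())

    @staticmethod
    def backward(ctx, _grad_of_handle):
        shape, dtype, device = ctx.meta
        grad = torch.empty(shape, dtype=dtype, device=device)
        dist.recv(grad, ctx.dst)
        return grad, None

class RecvFromPeer(Function):
    """Receive a tensor from `src`; backward() ships the gradient back."""
    @staticmethod
    def forward(ctx, dummy, src, shape, dtype, device):
        ctx.src = src
        buf = torch.empty(shape, dtype=dtype, device=device)
        dist.recv(buf, src)
        return buf

    @staticmethod
    def backward(ctx, grad_output):
        dist.send(grad_output.contiguous(), ctx.src)
        return None, None, None, None, None

def run_sender(dst=1):
    x = torch.randn(4, 8, requires_grad=True)
    y = x * 2                           # some local work on this rank
    handle = SendToPeer.apply(y, dst)
    handle.backward()                   # blocks until the peer sends dL/dy back
    print(x.grad)                       # gradient flows on through the local graph

def run_receiver(src=0):
    dummy = torch.zeros(0, requires_grad=True)  # keeps the recv op in the graph
    y = RecvFromPeer.apply(dummy, src, (4, 8), torch.float32, torch.device("cpu"))
    loss = y.sum()
    loss.backward()                     # triggers the send of dL/dy back to `src`
```

The receiver's `dummy` tensor is only there so that autograd attaches a grad_fn to the received tensor; the sender's scalar handle gives you something to call `.backward()` on once the peer has finished its backward pass.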