I was planning to process some data on a GPU (call it GPU1) and then send a tensor from GPU1 to another GPU (GPU2) using torch.distributed.send and torch.distributed.recv.
I was wondering, does the received tensor on GPU2 keep the gradient history from its previous life on GPU1? Is it possible to apply backpropagation?
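Something like this minimal sketch is what I have in mind (the rank numbering, shapes, and torchrun launch are just placeholders for illustration):

```python
# Launch with e.g.: torchrun --nproc_per_node=2 this_script.py
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")
    rank = dist.get_rank()
    device = torch.device(f"cuda:{rank}")

    if rank == 0:
        # "GPU1": some computation that builds an autograd history
        x = torch.randn(4, 4, device=device, requires_grad=True)
        y = (x * 2).sum(dim=0)
        dist.send(y, dst=1)   # does y's autograd history survive this send?
    else:
        # "GPU2": receive the tensor and keep computing
        y = torch.empty(4, device=device)
        dist.recv(y, src=0)
        loss = y.sum()
        # loss.backward()     # this is what I would like to be able to do,
                              # ideally reaching x back on rank 0

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```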
Thanks for your reply. Sorry for the incomplete explanation of the problem. What I want is not quite like your example. I want to code a parallel solver, so each GPU solves part of the problem in parallel, and after some operations some of them have to exchange their tensors in pairs. I was wondering whether the gradients of the tensors being transferred are lost, and whether they can still be backpropagated through.
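To make the pairwise exchange concrete, here is a rough sketch of what each process would do (the partner assignment, shapes, and function name are hypothetical, just to illustrate the pattern):

```python
import torch
import torch.distributed as dist

def exchange_with_partner(local: torch.Tensor, partner: int) -> torch.Tensor:
    """Swap `local` with the tensor held by rank `partner` (same shape assumed)."""
    received = torch.empty_like(local)
    # Non-blocking send/recv so both ranks in the pair can post their ops
    # without deadlocking on each other.
    reqs = [
        dist.isend(local, dst=partner),
        dist.irecv(received, src=partner),
    ]
    for req in reqs:
        req.wait()
    # `received` arrives as plain data with no autograd history attached --
    # that is exactly the part I am worried about for backpropagation.
    return received
```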