Is it possible to use a buffer for sending variables (with their computation graphs) over a multiprocessing queue? I am currently gathering log-probability variables for policy gradient in multiple child processes, and the bottleneck at the moment is transferring those variables to the parent process.
How can I make this transfer faster? With plain tensors I could just use a preallocated buffer, since I would know the size, but with variables that carry computation graphs I don't know how to do it.
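For context on why a plain queue struggles here: `mp.Queue` pickles its items, and PyTorch refuses to serialize non-leaf tensors that require grad, since the autograd graph cannot cross process boundaries. A minimal sketch of the failure and of the `detach()` escape hatch (which transfers the data but drops the graph):

```python
import pickle

import torch

x = torch.randn(3, requires_grad=True)
log_prob = torch.log_softmax(x, dim=-1)[0]  # non-leaf: carries a grad_fn / graph

# pickling (what mp.Queue does under the hood) rejects graph-carrying tensors
try:
    pickle.dumps(log_prob)
except RuntimeError as err:
    print("serialization failed:", err)

# the data itself transfers fine once detached, but the graph is gone
payload = pickle.dumps(log_prob.detach())
restored = pickle.loads(payload)
print(restored.requires_grad)  # False
```

This is why approaches like computing `loss.backward()` inside the worker and shipping only plain gradient tensors (or, as below, sharing weights directly) tend to replace shipping the variables themselves.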
This was so long ago that I don't remember whether I managed to solve it. The last time I used policy gradients I went with Hogwild-style training: instead of sending variables around, I share the weights across multiple processes and each process updates them asynchronously.
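The Hogwild-style setup can be sketched roughly like this (a toy model and loss, not the actual training code from back then): the parent moves the parameters into shared memory with `share_memory()`, and each worker runs its own optimizer over those shared parameters, stepping them in place without any locking.

```python
import torch
import torch.multiprocessing as mp
import torch.nn as nn


def train(model, steps=5):
    # each worker optimizes the *shared* parameters in place
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    for _ in range(steps):
        x = torch.randn(4, 8)
        loss = model(x).pow(2).mean()  # placeholder loss
        opt.zero_grad()
        loss.backward()
        opt.step()


model = nn.Linear(8, 2)
model.share_memory()  # parameters move into shared memory before forking
ctx = mp.get_context("fork")
workers = [ctx.Process(target=train, args=(model,)) for _ in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(all(p.is_shared() for p in model.parameters()))  # True
```

No log probabilities ever cross a process boundary; only the weights are shared, which sidesteps the serialization problem entirely.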
Another idea would be to use RPC. My only experience is with gRPC, where each servicer can run in its own process; maybe torch.distributed.rpc can be used in a similar way: https://pytorch.org/docs/stable/rpc.html
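For the basic call pattern, here is a hedged single-process sketch (world size 1, calling into itself, with made-up address/port values) just to show the `rpc_sync` shape; in real use each worker would run `init_rpc` with its own rank and name. Notably, `torch.distributed.autograd` can propagate gradients across RPC calls, which is closer to what the original question was after than a raw queue.

```python
import os

import torch
import torch.distributed.rpc as rpc

# hypothetical single-process setup; normally the parent and each worker
# call init_rpc with distinct names and ranks in a shared world
os.environ.setdefault("MASTER_ADDR", "localhost")
os.environ.setdefault("MASTER_PORT", "29500")
rpc.init_rpc("worker0", rank=0, world_size=1)

# invoke a function on a named worker and get the result back synchronously
out = rpc.rpc_sync("worker0", torch.add, args=(torch.ones(2), torch.ones(2)))
print(out)

rpc.shutdown()
```

Compared with gRPC, this keeps everything inside PyTorch, so tensors (and with distributed autograd, their gradients) move between processes without hand-rolled serialization.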