Hi,
I have a use case with two processes: one runs its work on the GPU and the other runs on the CPU. I want to send PyTorch tensors from the GPU process to the CPU process asynchronously, i.e., the data transfer to the CPU process should not block the workflow in the GPU-hosted process.
Is this idea viable? If so, does the torch.multiprocessing library have a data structure that lets me share data from the GPU process with another process running on the CPU?
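
To make the setup concrete, here is a rough sketch of what I have in mind, not something I have confirmed is the right approach. The names `start_copy`, `finish_copy`, and `consumer` are placeholders I made up. I am assuming that a copy into pinned host memory with `non_blocking=True`, followed by a `torch.multiprocessing` queue, is a reasonable way to do this, and the sketch falls back to plain CPU tensors when no GPU is present.

```python
import torch
import torch.multiprocessing as mp


def start_copy(t):
    """Begin copying `t` to CPU memory; returns (cpu_buffer, event_or_None)."""
    if not t.is_cuda:
        # No GPU involved: just take an independent CPU copy.
        return t.detach().clone(), None
    # Pinned (page-locked) host memory is needed for the copy to be asynchronous.
    buf = torch.empty(t.shape, dtype=t.dtype, device="cpu", pin_memory=True)
    buf.copy_(t, non_blocking=True)
    # The event marks the point in the current stream where the copy is done.
    event = torch.cuda.Event()
    event.record()
    return buf, event


def finish_copy(buf, event):
    """Wait (on the host) until the copy started by start_copy has completed."""
    if event is not None:
        event.synchronize()
    return buf


def consumer(q):
    """CPU-side process: read tensors until a None sentinel arrives."""
    while True:
        item = q.get()
        if item is None:
            break
        print("consumer received tensor with sum", item.sum().item())


if __name__ == "__main__":
    # "spawn" is the start method usually recommended when CUDA is involved.
    ctx = mp.get_context("spawn")
    q = ctx.Queue()
    p = ctx.Process(target=consumer, args=(q,))
    p.start()

    device = "cuda" if torch.cuda.is_available() else "cpu"
    pending = []
    for step in range(3):
        x = torch.full((4,), float(step), device=device)
        pending.append(start_copy(x))
        # More GPU work could be queued here while the copy is in flight.

    for buf, event in pending:
        # Only hand the buffer to the other process once its copy has finished.
        q.put(finish_copy(buf, event))
    q.put(None)
    p.join()
```

In this sketch the GPU process only waits at `finish_copy`, so what I am really asking is whether there is a built-in structure that handles this hand-off for me, or whether something like the above is what people normally do.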
Thanks.