Hello, I’m trying to share the same tensor between two processes.
In this case, the tensor is allocated in the shared memory pool so that it can be accessed from both.
```python
import torch
import torch.multiprocessing as mp

def sub_process(queue):
    tensor = queue.get()
    # do something...
    tensor = None
    tensor = queue.get()
    # do something...
    tensor = None

if __name__ == '__main__':
    mp.set_start_method('spawn')
    recv_queue = mp.Queue()
    t1 = torch.rand(2, 2, device="cuda")
    p = mp.Process(name="sub_process", target=sub_process, args=(recv_queue,))
    p.start()
    recv_queue.put(t1)
    # update t1 ...
    recv_queue.put(t1)
    p.join()
```
What I expected was that the second queue.get() call in sub_process would take far less time than the first one, since the tensor should be found in the cached memory pool via storage_from_cache.
However, I found that although a cached storage does exist in shared_cache (line 298 in reductions.py), when _new_with_weak_ptr is called to use that storage, it returns None.
The storage cache works as expected only when I keep the received objects alive manually instead of setting tensor = None.
Should I keep every received object alive one by one to use the shared cache pool, or is there a better way?
While debugging, I came up with another question.
It seems that senders also populate the storage cache, but I don’t think I understand the purpose of that code: I could not find where the sender-side cache entry is actually used; all lookups of the storage cache seem to happen on the receiver side.