I am new to both Python and PyTorch and am trying to understand how multiprocessing works. The following code:
import torch
import torch.multiprocessing as mp

def test(q):
    t = torch.normal(mean=0.0, std=1.0, size=(2, 3))
    q.put(t)

if __name__ == "__main__":
    mp.set_start_method("spawn", force=True)
    q = mp.SimpleQueue()
    processes = []
    for _ in range(4):
        p = mp.Process(target=test, args=(q,))
        p.start()
        processes.append(p)

    for p in processes:
        p.join()
    while not q.empty():
        print(q.get())
produces the following error:
Traceback (most recent call last):
  File "/home/cadenmiller/Documents/coding/pyTorch/mp.py", line 20, in <module>
    print(q.get())
  File "/usr/lib/python3.9/multiprocessing/queues.py", line 368, in get
    return _ForkingPickler.loads(res)
  File "/home/cadenmiller/.local/lib/python3.9/site-packages/torch/multiprocessing/reductions.py", line 289, in rebuild_storage_fd
    fd = df.detach()
  File "/usr/lib/python3.9/multiprocessing/resource_sharer.py", line 57, in detach
    with _resource_sharer.get_connection(self._id) as conn:
  File "/usr/lib/python3.9/multiprocessing/resource_sharer.py", line 86, in get_connection
    c = Client(address, authkey=process.current_process().authkey)
  File "/usr/lib/python3.9/multiprocessing/connection.py", line 507, in Client
    c = SocketClient(address)
  File "/usr/lib/python3.9/multiprocessing/connection.py", line 635, in SocketClient
    s.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused
I read somewhere online that this happens because the subprocess must still be running when I get the tensor from the queue, but I have no idea how to do that. Any advice or suggestions would be appreciated.
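
For reference, here is a minimal sketch of what "keeping the subprocess running until the tensor is read" might look like. It assumes the explanation above is correct, i.e. that each worker exits before the parent calls q.get(), so the parent can no longer connect to it to receive the tensor's storage. The names worker, collect, and done are illustrative and not part of any library; the idea is to read from the queue before joining, and to hold each worker open with an mp.Event until the parent has finished reading.

```python
import torch
import torch.multiprocessing as mp


def worker(q, done):
    # Produce a tensor and hand it to the parent through the queue.
    t = torch.normal(mean=0.0, std=1.0, size=(2, 3))
    q.put(t)
    # Assumption: the parent must be able to reach this process while it
    # unpickles the tensor, so stay alive until the parent signals it is done.
    done.wait()


def collect(num_workers=4):
    # Hypothetical helper: start the workers, read one tensor per worker,
    # and only then let the workers exit and join them.
    q = mp.SimpleQueue()
    done = mp.Event()
    processes = [
        mp.Process(target=worker, args=(q, done)) for _ in range(num_workers)
    ]
    for p in processes:
        p.start()
    # One blocking get() per worker, performed BEFORE join().
    results = [q.get() for _ in processes]
    # All tensors have been received, so the workers may exit now.
    done.set()
    for p in processes:
        p.join()
    return results


if __name__ == "__main__":
    mp.set_start_method("spawn", force=True)
    for t in collect():
        print(t)
```

Calling q.get() once per worker also avoids relying on q.empty(), which can report an empty queue if a slow worker has not put its tensor yet.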