Is there a way to have rank 0 and 1 send a message to each other at the same time and then recv the message sent at the same time without resulting in deadlock?
I’ve tried the following code
reqs = []
if rank == 0:
    neighbour = 1
if rank == 1:
    neighbour = 0
reqs.append(dist.isend(tensor=send_tensor, dst=neighbour))
reqs.append(dist.irecv(tensor=recv_tensor, src=neighbour))
for req in reqs:
    req.wait()
But as you can imagine, the first “req” in the “reqs” list is the send request on both processes, and both end up waiting for the other one to receive, resulting in deadlock.
Of course you can do send->recv in rank 0 and recv->send in rank 1. But this would take twice the amount of time.
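For reference, the mirrored-order workaround mentioned above can be sketched like this (a minimal two-rank sketch; the gloo backend, localhost setup, and the `_free_port`/`run` helpers are my assumptions, not part of the original post):

```python
import multiprocessing as mp
import os
import socket
import torch
import torch.distributed as dist

def _free_port():
    # Helper (mine): grab an ephemeral port so repeated runs don't collide.
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

def exchange(rank, world_size, port):
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = str(port)
    dist.init_process_group("gloo", rank=rank, world_size=world_size)
    neighbour = 1 - rank
    send_tensor = torch.full((4,), float(rank))
    recv_tensor = torch.empty(4)
    if rank == 0:
        dist.send(send_tensor, dst=neighbour)  # rank 0: send first, then recv
        dist.recv(recv_tensor, src=neighbour)
    else:
        dist.recv(recv_tensor, src=neighbour)  # rank 1: recv first, then send
        dist.send(send_tensor, dst=neighbour)
    # Each rank should now hold the other rank's value.
    assert torch.all(recv_tensor == float(neighbour))
    dist.destroy_process_group()

def run():
    ctx = mp.get_context("fork")  # fork so children don't re-import this module
    port = _free_port()
    procs = [ctx.Process(target=exchange, args=(r, 2, port)) for r in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    assert all(p.exitcode == 0 for p in procs)

if __name__ == "__main__":
    run()
```

Because the two ranks use opposite orders, one side's send always meets a matching recv, so it cannot deadlock; the cost is that the two transfers are serialized.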
In MPI there is an MPI_Waitall function, so the requests don’t have to complete in a predetermined order and the deadlock is avoided. But there isn’t one in PyTorch.
This seems a very simple thing to do but I couldn’t figure out how to do it. Any help would be appreciated.
The serialized approach takes twice the amount of time as doing a single isend, as in the following:
if rank == 0:
    reqs.append(dist.isend(tensor=send_tensor, dst=1))
else:
    reqs.append(dist.irecv(tensor=recv_tensor, src=0))
for req in reqs:
    req.wait()
Looking at this GitHub issue, it does seem like batch_isend_irecv() is supposed to support concurrent send/recv, so I don’t know what I’m missing here.
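For what it’s worth, here is a minimal two-rank sketch of batch_isend_irecv() with dist.P2POp, which batches the send and the recv so neither rank blocks on a fixed completion order (it assumes a recent PyTorch where the gloo backend supports batch_isend_irecv; the localhost setup and the `_free_port`/`run` helpers are mine):

```python
import multiprocessing as mp
import os
import socket
import torch
import torch.distributed as dist

def _free_port():
    # Helper (mine): grab an ephemeral port so repeated runs don't collide.
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

def exchange(rank, world_size, port):
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = str(port)
    dist.init_process_group("gloo", rank=rank, world_size=world_size)
    neighbour = 1 - rank
    send_tensor = torch.full((4,), float(rank))
    recv_tensor = torch.empty(4)
    # Both ranks queue a send and a recv; batch_isend_irecv issues them
    # together, so the symmetric ordering cannot deadlock.
    ops = [
        dist.P2POp(dist.isend, send_tensor, neighbour),
        dist.P2POp(dist.irecv, recv_tensor, neighbour),
    ]
    for req in dist.batch_isend_irecv(ops):
        req.wait()
    # Each rank should now hold the other rank's value.
    assert torch.all(recv_tensor == float(neighbour))
    dist.destroy_process_group()

def run():
    ctx = mp.get_context("fork")  # fork so children don't re-import this module
    port = _free_port()
    procs = [ctx.Process(target=exchange, args=(r, 2, port)) for r in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    assert all(p.exitcode == 0 for p in procs)

if __name__ == "__main__":
    run()
```

Here both ranks run identical code, and the two transfers can overlap instead of being serialized as in the send-then-recv workaround.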