How is the RPC framework different from what this tutorial shows (Writing Distributed Applications with PyTorch — PyTorch Tutorials 1.7.1 documentation)? That tutorial uses send/recv, all_reduce, etc., and there are so many options that it's confusing and frustrating. I understand the error, but I don't understand why it can't just give me something like an rpc.map and let me use gradients, so that something like this just works:
    with Pool(100) as pool:
        losses = pool.map(forward, batch)
        torch.stack(losses).mean().backward()
        optimizer.step()
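After digging through the RPC docs, it looks like the pattern I want can be spelled with rpc_async plus distributed autograd. Here is a sketch (untested; compute_loss, the worker names, and the hyperparameters are my own stand-ins, and it assumes rpc.init_rpc has already been called on every process and that this module is importable on the workers):

    import torch
    import torch.optim as optim
    import torch.distributed.rpc as rpc
    import torch.distributed.autograd as dist_autograd
    from torch.distributed.optim import DistributedOptimizer

    def compute_loss(weight, shard):
        # Runs on a remote worker. Because `weight` requires grad,
        # distributed autograd records the send/recv pair and can
        # backprop through the RPC boundary.
        return (shard @ weight).pow(2).mean()

    def train_step(weight, batch_shards, workers):
        # DistributedOptimizer takes RRefs to the parameters it updates.
        opt = DistributedOptimizer(optim.SGD, [rpc.RRef(weight)], lr=0.1)
        with dist_autograd.context() as context_id:
            # Fan out one shard per worker -- the rpc.map I was asking for.
            futs = [rpc.rpc_async(w, compute_loss, args=(weight, s))
                    for w, s in zip(workers, batch_shards)]
            loss = torch.stack([f.wait() for f in futs]).mean()
            # Gradients accumulate per-context, not in .grad fields.
            dist_autograd.backward(context_id, [loss])
            opt.step(context_id)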
My intention was to parallelize meta-learning with torchmeta + higher, but it seems that path is dead for DDP until higher is incorporated into the core of PyTorch. See:
But the RPC path might not be dead:
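For reference, the per-task inner loop I'm trying to fan out looks roughly like this with higher (again just a sketch; the toy tensors stand in for the support/query sets a torchmeta task would provide):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    import higher

    def task_loss(model, inner_opt, x_support, y_support,
                  x_query, y_query, k=5):
        # copy_initial_weights=False lets the meta-gradient flow back
        # into the original parameters of `model`.
        with higher.innerloop_ctx(
                model, inner_opt,
                copy_initial_weights=False) as (fmodel, diffopt):
            for _ in range(k):
                diffopt.step(F.mse_loss(fmodel(x_support), y_support))
            # Query loss is differentiable w.r.t. the initial parameters.
            return F.mse_loss(fmodel(x_query), y_query)

    # toy usage: one task, then a meta-update via plain backward()
    model = nn.Linear(4, 1)
    inner_opt = torch.optim.SGD(model.parameters(), lr=0.1)
    x_s, y_s = torch.randn(10, 4), torch.randn(10, 1)
    x_q, y_q = torch.randn(10, 4), torch.randn(10, 1)
    task_loss(model, inner_opt, x_s, y_s, x_q, y_q).backward()

That task_loss is exactly the kind of forward I'd like to hand to an rpc.map-style call, one task per worker.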