How to use distributed PyTorch on multiple machines

Sur · April 16, 2018, 8:49am

Hi,

I am trying to run distributed PyTorch on multiple machines. From the documentation, I can see that we can define the master address and its port. However, it is not clear how to define which worker nodes should be used for computation, as we can do in TensorFlow using a ClusterSpec.

Is there anything similar to TensorFlow's ClusterSpec in PyTorch, where we can define the nodes to be used in computation?
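
To make the question concrete, here is a rough sketch of the kind of thing I have in mind. It assumes the `env://` initialization method, which as far as I understand reads `MASTER_ADDR`, `MASTER_PORT`, `WORLD_SIZE`, and `RANK` from the environment. The helper `node_env`, the host addresses, and the port number are all made up for illustration; they are not part of PyTorch. I am also assuming one process per machine, with the first host in the list acting as master.

```python
from typing import Dict, List


def node_env(hosts: List[str], this_host: str, master_port: int = 29500) -> Dict[str, str]:
    """Build the env:// variables for the process running on `this_host`.

    Hypothetical helper, not a PyTorch API. Assumes hosts[0] is the master
    and that each host runs exactly one process, so rank == index in `hosts`.
    """
    if this_host not in hosts:
        raise ValueError(f"{this_host!r} is not in the host list")
    return {
        "MASTER_ADDR": hosts[0],            # every process connects to the first host
        "MASTER_PORT": str(master_port),    # arbitrary free port, same on all nodes
        "WORLD_SIZE": str(len(hosts)),      # total number of participating processes
        "RANK": str(hosts.index(this_host)),  # unique id of this process
    }


if __name__ == "__main__":
    # Placeholder addresses standing in for a ClusterSpec-style node list.
    cluster = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
    for host in cluster:
        print(host, node_env(cluster, host))

    # On each machine one would then presumably do something like:
    #   os.environ.update(node_env(cluster, my_address))
    #   torch.distributed.init_process_group(backend="gloo", init_method="env://")
```

If this is right, the set of worker nodes is never listed in one place as in a ClusterSpec. It is implied by which machines start a process with the same master address, port, and world size.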

Thanks,
Surbhi

Sur · April 16, 2018, 10:41am

Any suggestions?