Distributed PyTorch with Existing MPI Processes

Vibhatha_Abeykoon · November 3, 2019, 11:45pm

Hi All,

I already have an MPI program running in distributed mode. I want to reuse those processes to run a PyTorch model in a distributed fashion. Is there a straightforward way to do this?

For instance, MPI_INIT() has already been called by my program, and I have a world_size of 4.
Is it possible to start a PyTorch distributed program from this state?

Adding more information:

How do I use the following option with a PyTorch DistributedDataParallel model?

store(Store, optional): Key/value store accessible to all workers, used
to exchange connection/address information.
Mutually exclusive with init_method.
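
A minimal sketch of how this option might be used, assuming `torch.distributed.TCPStore` as the key/value store, the `gloo` backend, and a rank and world size already known from the running MPI job. The helper names `tcp_store_kwargs` and `init_with_store`, the host `127.0.0.1`, and the port `29500` are placeholders chosen for illustration, not something PyTorch prescribes.

```python
from datetime import timedelta


def tcp_store_kwargs(rank, world_size, host="127.0.0.1", port=29500):
    """Build keyword arguments for torch.distributed.TCPStore.

    Rank 0 hosts the store; every other rank connects to it as a client.
    host and port are placeholder values for illustration.
    """
    return {
        "host_name": host,
        "port": port,
        "world_size": world_size,
        "is_master": rank == 0,
        "timeout": timedelta(seconds=60),
    }


def init_with_store(rank, world_size):
    """Initialise torch.distributed through an explicit store (no init_method)."""
    # Imported lazily so the helper above can be used without torch installed.
    import torch.distributed as dist

    store = dist.TCPStore(**tcp_store_kwargs(rank, world_size))
    # store and init_method are mutually exclusive; when a store is passed,
    # rank and world_size have to be supplied explicitly.
    dist.init_process_group(
        backend="gloo", store=store, rank=rank, world_size=world_size
    )
```

Each process would call `init_with_store(rank, world_size)` once, with the rank and size it already holds from MPI. On a multi-node job the host would need to be an address of the rank 0 machine that all other ranks can reach.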

I am trying to work out how to map my existing MPI processes onto a distributed data-parallel model in PyTorch.

Is there a way for PyTorch's distributed mode to consume existing MPI processes?
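
To make the question concrete, here is a rough sketch of one possible mapping. It reads the rank and world size from the existing MPI communicator and passes them to PyTorch through the environment variables used by `init_method="env://"`. It assumes `mpi4py` is available in the same processes, that the `gloo` backend is acceptable, and that rank 0's hostname resolves from the other nodes. The function names and the port `29500` are hypothetical.

```python
import os


def mpi_to_torch_env(rank, world_size, master_addr, master_port=29500):
    """Translate MPI rank info into the variables read by init_method="env://"."""
    return {
        "RANK": str(rank),
        "WORLD_SIZE": str(world_size),
        "MASTER_ADDR": master_addr,
        "MASTER_PORT": str(master_port),
    }


def init_torch_from_mpi():
    """Reuse the ranks of an already-running MPI job for torch.distributed."""
    import socket

    from mpi4py import MPI  # assumption: mpi4py is installed next to the MPI program
    import torch.distributed as dist

    comm = MPI.COMM_WORLD
    rank, world_size = comm.Get_rank(), comm.Get_size()
    # Rank 0 shares its hostname so every rank agrees on the rendezvous address.
    master_addr = comm.bcast(socket.gethostname() if rank == 0 else None, root=0)

    os.environ.update(mpi_to_torch_env(rank, world_size, master_addr))
    dist.init_process_group(backend="gloo", init_method="env://")
    return rank, world_size


def wrap_model(model):
    """Wrap a model for data-parallel training once the process group exists."""
    from torch.nn.parallel import DistributedDataParallel as DDP

    return DDP(model)
```

Each MPI rank would call `init_torch_from_mpi()` once and then `wrap_model(model)`. If PyTorch was built from source with MPI support, `backend="mpi"` is another option, in which case the rank and world size come from the MPI runtime. Whether that path tolerates an MPI_INIT() call made earlier by the host program is not verified here.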