Hi All,
I already have an MPI program running in distributed mode. I want to reuse its processes to run a PyTorch model in a distributed fashion. Is there a straightforward way to do this?
For instance, MPI_INIT() has already been called from my programme, and I have a world_size of 4.
Is it possible to start a PyTorch distributed programme from this setup?
Adding more information:
How do I use the following option with a PyTorch DistributedDataParallel model?
store (Store, optional): Key/value store accessible to all workers, used to exchange connection/address information. Mutually exclusive with init_method.
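For context, the description above comes from the docs for `torch.distributed.init_process_group`, so the store is passed there rather than to DistributedDataParallel directly. Below is a minimal sketch of how it might be wired up, assuming a `TCPStore` hosted by rank 0 and the `gloo` backend. The helper names, the host and port 29500 are placeholders for illustration, and the rank and world size are assumed to come from the existing MPI job.

```python
def tcp_store_args(rank, world_size, host="127.0.0.1", port=29500):
    """Positional arguments for TCPStore: (host_name, port, world_size, is_master).

    Rank 0 hosts the store; every other rank connects to it as a client.
    The host and port defaults are placeholders.
    """
    return (host, port, world_size, rank == 0)


def init_with_store(rank, world_size, host="127.0.0.1", port=29500, backend="gloo"):
    """Initialise the default process group through an explicit store."""
    # Imported lazily so tcp_store_args can be used without torch installed.
    import torch.distributed as dist

    store = dist.TCPStore(*tcp_store_args(rank, world_size, host, port))
    # When a store is given, rank and world_size must be passed explicitly,
    # and init_method must be left unset (the two are mutually exclusive).
    dist.init_process_group(backend=backend, store=store,
                            rank=rank, world_size=world_size)
    return store
```

Each of the 4 processes would call `init_with_store(rank, 4, host=...)` with its own rank and with `host` set to an address of the rank 0 node that the other nodes can reach.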
I am trying to work out how to map my existing MPI processes onto a distributed data-parallel model in PyTorch.
Is there a way for PyTorch's distributed mode to consume existing MPI processes?
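To make the question concrete, here is a rough sketch of the kind of mapping I have in mind, assuming `mpi4py` is available in the same processes and that the `gloo` backend is acceptable. It takes the rank and size from `MPI.COMM_WORLD`, broadcasts rank 0's hostname, and hands those values to PyTorch through the variables that `init_method="env://"` reads. The function names and port 29500 are hypothetical, and it assumes rank 0's hostname resolves from the other nodes.

```python
import os
import socket


def torch_env_from_mpi(rank, world_size, master_addr, master_port=29500):
    """Build the environment variables read by init_method='env://'."""
    return {
        "RANK": str(rank),
        "WORLD_SIZE": str(world_size),
        "MASTER_ADDR": master_addr,
        "MASTER_PORT": str(master_port),
    }


def init_torch_from_mpi(backend="gloo", master_port=29500):
    """Reuse the ranks of an existing MPI job for torch.distributed."""
    # Imported lazily so torch_env_from_mpi works without these packages.
    from mpi4py import MPI
    import torch.distributed as dist

    comm = MPI.COMM_WORLD
    rank, world_size = comm.Get_rank(), comm.Get_size()
    # Rank 0 shares its hostname so every process agrees on the rendezvous address.
    master_addr = comm.bcast(socket.gethostname() if rank == 0 else None, root=0)

    os.environ.update(torch_env_from_mpi(rank, world_size, master_addr, master_port))
    dist.init_process_group(backend=backend, init_method="env://")
    return rank, world_size
```

After this call, the model could presumably be wrapped with `torch.nn.parallel.DistributedDataParallel(model)` in each process. Note that in this sketch MPI only supplies the rank layout; the collective communication itself goes through `gloo`, not MPI.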