Train multiple models on multiple GPUs

MPI is not necessary here; the torch.distributed package now provides both MPI-style (collective) and RPC-style distributed APIs. It also supports the gloo, mpi, and nccl backends (for the MPI-style API only), so if you don't want extra hassle, those should be sufficient.
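As a rough sketch of what I mean (my own example, not from any particular tutorial, assuming independent models with one GPU per process): spawn one worker per GPU with torch.multiprocessing, initialize a process group with the nccl backend, and let each rank train its own model on its own device. The address/port values and the toy model are placeholders.

```python
import os

import torch
import torch.distributed as dist
import torch.multiprocessing as mp
import torch.nn as nn


def train(rank: int, world_size: int) -> None:
    # Rendezvous over localhost TCP; address and port are arbitrary choices.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    # nccl backend for GPU; swap in "gloo" if you want to test on CPU.
    dist.init_process_group("nccl", rank=rank, world_size=world_size)

    device = torch.device(f"cuda:{rank}")
    model = nn.Linear(10, 1).to(device)   # each rank owns an independent model
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    for _ in range(100):                  # dummy training loop on random data
        x = torch.randn(32, 10, device=device)
        y = torch.randn(32, 1, device=device)
        loss = nn.functional.mse_loss(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    n_gpus = torch.cuda.device_count()
    mp.spawn(train, args=(n_gpus,), nprocs=n_gpus, join=True)
```

Since the models here are fully independent, the process group is only needed if you later want collectives (e.g. all_reduce to aggregate metrics); if you never communicate between ranks, plain torch.multiprocessing would also do.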