Correct usage of torch.distributed.run (multi-node multi-gpu)

Also, IIUC, torch.distributed.run is meant to be fully backward-compatible with torch.distributed.launch. Have you tried simply dropping in torch.distributed.run with the same launch arguments? If so, what sort of issues did you hit?
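
For context, here is a minimal sketch of what the drop-in swap looks like, assuming a hypothetical `train.py` script. The one incompatibility I'm aware of: torch.distributed.run always exports the local rank through the `LOCAL_RANK` environment variable (behaving like launch with `--use_env`), so a script that reads the `--local_rank` CLI argument passed by torch.distributed.launch may need a fallback. The `get_local_rank` helper below is just an illustrative way to support both launchers:

```python
# Hypothetical train.py that works under both launchers.
#
# Old single-node invocation:
#   python -m torch.distributed.launch --nproc_per_node=4 train.py
# Drop-in replacement:
#   python -m torch.distributed.run --nproc_per_node=4 train.py
# Multi-node, both launchers accept the same static-rendezvous flags:
#   ... --nnodes=2 --node_rank=0 --master_addr=10.0.0.1 --master_port=29500 train.py
import argparse
import os

import torch
import torch.distributed as dist


def get_local_rank() -> int:
    # torch.distributed.run (and launch with --use_env) sets the LOCAL_RANK
    # env var; plain torch.distributed.launch instead passes a --local_rank
    # CLI argument. Prefer the env var, fall back to the argument.
    parser = argparse.ArgumentParser()
    parser.add_argument("--local_rank", type=int, default=-1)
    args, _ = parser.parse_known_args()
    return int(os.environ.get("LOCAL_RANK", args.local_rank))


if __name__ == "__main__":
    local_rank = get_local_rank()
    torch.cuda.set_device(local_rank)
    # Both launchers export MASTER_ADDR/MASTER_PORT/RANK/WORLD_SIZE, so the
    # default env:// rendezvous just works.
    dist.init_process_group(backend="nccl")
    print(f"rank {dist.get_rank()}/{dist.get_world_size()}, local rank {local_rank}")
    dist.destroy_process_group()
```

With a script written this way, swapping launch for run (or later torchrun) should not require changing anything else in the command line.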