In the DDP tutorial, the author uses "multi-node" on a single computer. I don't understand why. When should I use multi-node?
Could you post a link to the tutorial?
A node usually refers to a host, so I'm not sure what multi-node on a single computer means.
I guess the author just wants to show how to use multi-node and lets us generalize to other environments?
You can replace it with “multi-process” and it will still be valid. It’s common to use a single PyTorch process per GPU device in your system. Running 8 processes across 8 machines won’t be different from running 8 processes on a single machine (provided it has 8 GPUs), except for performance.
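To make the point concrete, here is a minimal sketch in plain Python (no PyTorch needed) of how global ranks are typically laid out. It assumes the common convention `global_rank = node_rank * nproc_per_node + local_rank`; the helper names `rank_layout` and `global_ranks` are made up for illustration and are not part of any PyTorch API. In a real job the launcher usually provides these values to each process, for example through the `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` environment variables set by `torchrun`.

```python
def rank_layout(nnodes, nproc_per_node):
    """List (node_rank, local_rank, global_rank) for every process in a job.

    Assumes the usual contiguous convention; a launcher may assign
    ranks differently.
    """
    layout = []
    for node_rank in range(nnodes):
        for local_rank in range(nproc_per_node):
            global_rank = node_rank * nproc_per_node + local_rank
            layout.append((node_rank, local_rank, global_rank))
    return layout


def global_ranks(nnodes, nproc_per_node):
    """Return only the global ranks, in launch order."""
    return [g for _, _, g in rank_layout(nnodes, nproc_per_node)]


# One machine with 8 GPUs: 8 processes, all on node 0.
single_machine = global_ranks(nnodes=1, nproc_per_node=8)

# Eight machines with 1 GPU each: 8 processes, one per node.
eight_machines = global_ranks(nnodes=8, nproc_per_node=1)

# Either way the job sees the same set of global ranks and world size.
print(single_machine == eight_machines)
```

From the training script's point of view both layouts look the same: 8 ranks in one process group. What changes is which GPU each process picks (its local rank) and how fast the ranks can talk to each other, since communication between machines goes over the network.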