For a single node, I set
os.environ['MASTER_ADDR'] = 'localhost'
os.environ['MASTER_PORT'] = '29500'
and the world size is passed in as an input parameter.
However, with multiple nodes these have to be set differently, and I do not know how.
For example, suppose the job runs on 4 nodes with the names below.
C1-01 C1-02 C2-01 C2-02
But the node names change each time I submit the job, so I cannot hard-code them.
How should I set MASTER_ADDR for the program?
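
One possible approach, sketched here under assumptions rather than as a confirmed answer: if every process can get the list of node names allocated to the job (how that list is obtained depends on the scheduler and is not shown), each process can sort the list and use the first name as the master. Because all processes sort the same names, they all agree on the same MASTER_ADDR without any hard-coded host. The function name configure_master is hypothetical, and the port 29500 is simply carried over from the single-node setup above.

```python
import os


def configure_master(node_names, port="29500"):
    """Pick one node as master and export MASTER_ADDR / MASTER_PORT.

    Assumption: every process calls this with the same set of node
    names, so sorting gives the same first entry on every node.
    """
    if not node_names:
        raise ValueError("node_names must not be empty")
    # Sorting makes the choice independent of the order in which the
    # scheduler happens to report the nodes.
    master = sorted(node_names)[0]
    os.environ["MASTER_ADDR"] = master
    os.environ["MASTER_PORT"] = str(port)
    return master


# Example with the node names from the question; in a real job the
# list would come from the scheduler instead of being written out.
master = configure_master(["C1-01", "C1-02", "C2-01", "C2-02"])
```

With the four names above, the chosen master is C1-01. The name must be resolvable and reachable from the other nodes, and the port must be free on the master.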