I have a question about the architecture of distributed PyTorch!
When I ran some examples, I saw that worker A can send to and receive from worker B directly.
Why do we need MASTER_PORT and MASTER_ADDR?
For the port, I can understand that workers need this number to tell whether other workers belong to the same program. However, I do not understand why we need MASTER_ADDR.
If it were a master-slave model, I would see no problem, since the master worker would manage all the work.
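
For context, here is a minimal sketch of the kind of example I mean. The gloo backend, the address 127.0.0.1, the port 29500, the two-process setup, and the helper name `rendezvous_env` are my own assumptions for illustration, not taken from a specific tutorial. With the default `env://` initialization, `init_process_group` reads MASTER_ADDR and MASTER_PORT from the environment, and afterwards rank 0 sends a tensor straight to rank 1:

```python
import os

import torch
import torch.distributed as dist
import torch.multiprocessing as mp


def rendezvous_env(addr="127.0.0.1", port=29500):
    # Hypothetical helper: the two variables every worker is given.
    # The address and port values are assumptions for a single-machine run.
    return {"MASTER_ADDR": addr, "MASTER_PORT": str(port)}


def worker(rank, world_size):
    # Every worker receives the same MASTER_ADDR / MASTER_PORT.
    os.environ.update(rendezvous_env())
    # Default init_method is "env://", which reads the variables set above.
    dist.init_process_group(backend="gloo", rank=rank, world_size=world_size)

    tensor = torch.zeros(1)
    if rank == 0:
        tensor += 1
        dist.send(tensor, dst=1)   # worker 0 sends to worker 1
    else:
        dist.recv(tensor, src=0)   # worker 1 receives from worker 0
    print(f"rank {rank} has tensor {tensor.item()}")

    dist.destroy_process_group()


if __name__ == "__main__":
    # Launch two worker processes on this machine; rank is passed as the first argument.
    mp.spawn(worker, args=(2,), nprocs=2, join=True)
```

Both workers are given the same MASTER_ADDR and MASTER_PORT, yet the `send`/`recv` pair runs between rank 0 and rank 1, which is what prompts my question.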