Hi there. Recently I have been using multiple CPU cores for training. On my own PC, a 2017 MacBook (1 CPU, 4 cores), I just set os.environ['MASTER_PORT'] to a single value, and the multiple processes ran fine on the same machine. However, when I migrated the code to a cluster in order to use more cores, I needed to give a different value to os.environ['MASTER_PORT'] for each process. Otherwise, I get a permission denied error, as shown below.
store = TCPStore(master_addr, master_port, world_size, start_daemon, timeout)
RuntimeError: Permission denied
I don’t understand the reason for this. Could someone explain it?
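
For reference, here is a minimal sketch of one way to pick the port instead of hard-coding it. It is not the exact code from my training script: the helper names find_free_port and set_rendezvous_env are hypothetical, and the address 127.0.0.1 assumes all processes run on one machine. It only uses the standard library and sets the MASTER_ADDR and MASTER_PORT variables that torch.distributed reads with the env:// init method. I am not sure whether this is related to the error above.

```python
import os
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port on `host` (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the OS choose a free port
        return sock.getsockname()[1]


def set_rendezvous_env(port: int, addr: str = "127.0.0.1") -> None:
    """Export the variables read by torch.distributed's env:// initialization."""
    os.environ["MASTER_ADDR"] = addr
    os.environ["MASTER_PORT"] = str(port)  # environment values must be strings


# Pick a port once, before the worker processes are started.
port = find_free_port()
set_rendezvous_env(port)
```

Note that the port is released as soon as find_free_port returns, so another program could in principle take it before training starts; this is a convenience sketch, not a guarantee.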