What is the proper way to use torch distributed 2+ times on a single node?

Launching a job for DistributedDataParallel with torch.distributed.launch works fine the first time. On a second launch on the same node, however, I get RuntimeError: Address already in use.

I’ve tried modifying MASTER_ADDR, but then I get RuntimeError: Connection timed out. What is the proper way to make sure concurrent distributed jobs on one node do not collide?

It turns out the address can stay as localhost: changing only the port (MASTER_PORT) was sufficient. Each concurrent job just needs its own free master port.
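For reference, a minimal sketch of the per-job setup, assuming the env:// rendezvous that torch.distributed.launch relies on; the backend, port value, and single-process world size below are illustrative assumptions, not the exact values from the original jobs:

```python
import os
import torch.distributed as dist

def init_distributed(master_port: str) -> None:
    """Initialize the default process group with a job-specific port.

    Giving each concurrent job on the node a distinct MASTER_PORT avoids
    the "Address already in use" collision on the default port.
    """
    # Values below are assumptions for a single-process illustration;
    # torch.distributed.launch normally sets RANK/WORLD_SIZE for you.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")  # localhost is fine
    os.environ["MASTER_PORT"] = master_port            # unique per job
    os.environ.setdefault("RANK", "0")
    os.environ.setdefault("WORLD_SIZE", "1")
    dist.init_process_group(backend="gloo", init_method="env://")

if __name__ == "__main__":
    # e.g. first job uses "29500", a second concurrent job uses "29501"
    init_distributed(master_port="29501")
    print("initialized, rank:", dist.get_rank())
    dist.destroy_process_group()
```

When launching through torch.distributed.launch itself, the equivalent is passing a different --master_port value to each invocation.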