Official DDP example is broken


I’m trying the example code from the Distributed Data Parallel — PyTorch 1.10.1 documentation with PyTorch 1.8.1, and I’m getting: ValueError: Error initializing torch.distributed using env:// rendezvous: environment variable MASTER_ADDR expected, but not set

What is wrong?

Thanks for bringing this up!

The issue is with the docs: the environment variables MASTER_ADDR and MASTER_PORT need to be set for the default env:// initialization to work correctly.
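For reference, a minimal sketch of what that looks like, assuming a single-process job on localhost with the gloo backend (the address, port, rank, and world size here are placeholders for illustration):

```python
import os
import torch.distributed as dist

# The env:// rendezvous reads these variables; if they are missing,
# init_process_group raises the ValueError shown above.
os.environ["MASTER_ADDR"] = "localhost"  # hostname/IP of the rank-0 process
os.environ["MASTER_PORT"] = "29500"      # any free TCP port

# Single-process example; in a real job each worker passes its own rank
# (or you set the RANK and WORLD_SIZE env variables instead).
dist.init_process_group(backend="gloo", rank=0, world_size=1)
print(dist.is_initialized())  # True

dist.destroy_process_group()
```

Alternatively, launching with torchrun (or torch.distributed.launch on older versions) sets these variables for you.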

This is being fixed in [Docs][BE] DDP doc fix by rohan-varma · Pull Request #71363 · pytorch/pytorch · GitHub.