Hi,
I’m trying the example code from the Distributed Data Parallel — PyTorch 1.10.1 documentation with PyTorch 1.8.1, and I’m getting:
ValueError: Error initializing torch.distributed using env:// rendezvous: environment variable MASTER_ADDR expected, but not set
What is wrong?
Thanks for bringing this up!
The issue is with the docs: the environment variables MASTER_ADDR and MASTER_PORT must be set for the default env:// initialization to work.
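For reference, here is a minimal sketch of setting those variables before calling init_process_group (the address, port, rank, and world size are placeholder values for a single-process run; in a real DDP launch they come from your launcher):

```python
import os
import torch.distributed as dist

# env:// rendezvous reads these two variables; without them you get the
# "MASTER_ADDR expected, but not set" ValueError.
os.environ["MASTER_ADDR"] = "127.0.0.1"  # address of the rank-0 host
os.environ["MASTER_PORT"] = "29500"      # any free port on that host

# Single-process example values; normally rank/world_size are per-process.
dist.init_process_group(backend="gloo", rank=0, world_size=1)

initialized = dist.is_initialized()
print(initialized)  # True

dist.destroy_process_group()
```

Alternatively, you can skip the environment variables by passing an explicit init_method, e.g. init_method="tcp://127.0.0.1:29500".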
Fixing it in [Docs][BE] DDP doc fix by rohan-varma · Pull Request #71363 · pytorch/pytorch · GitHub.