An error with torch.distributed: RANK expected, but not set

Hi,

Today I tried to use DDP to run the forward pass of my module, but I ran into this error.

When I execute this code:

>>> import torch
>>> torch.distributed.is_nccl_available()
True
>>> torch.distributed.init_process_group(backend='nccl')

I get this error:

ValueError: Error initializing torch.distributed using env:// rendezvous: environment variable RANK expected, but not set

How do I deal with it?

I found a way to get around this error, by launching the script as follows:

python -m torch.distributed.launch mytestfile.py

But why does this work?
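
My current understanding, which I would be glad to have confirmed: the default env:// rendezvous reads RANK, WORLD_SIZE, MASTER_ADDR and MASTER_PORT from the environment. The launcher sets them for every process it starts, while a plain Python session sets none of them. Below is a minimal sketch of setting them by hand for a single process on one machine. The helper names, the address 127.0.0.1, the port 29500 and the gloo default are my own choices for illustration, not something taken from the PyTorch docs.

```python
import os


def set_single_process_env(port="29500"):
    """Fill in the variables that env:// rendezvous reads.

    Assumption: one process on one machine, so rank 0 of world size 1.
    Values already present in the environment are left untouched.
    """
    defaults = {
        "MASTER_ADDR": "127.0.0.1",  # assumed: rendezvous on the local machine
        "MASTER_PORT": port,         # assumed: any free TCP port works
        "RANK": "0",
        "WORLD_SIZE": "1",
    }
    for key, value in defaults.items():
        os.environ.setdefault(key, value)
    return {key: os.environ[key] for key in defaults}


def init_single_process(backend="gloo"):
    """Initialize the default process group without the launcher.

    torch is imported here so the helper above stays usable without it.
    Pass backend="nccl" on a machine with a GPU.
    """
    import torch.distributed as dist

    set_single_process_env()
    dist.init_process_group(backend=backend)  # init_method defaults to env://
    return dist
```

Calling init_single_process() in the interpreter should then get past the RANK error. For more than one process, each process needs its own RANK, which is the part the launcher takes care of.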
