Using `torch.distributed.all_gather_object` raises an error with 1 GPU but works fine with multiple GPUs

I’m currently using HuggingFace Accelerate to run some distributed experiments, and I have the following code inside my evaluation loop:

import torch.distributed as dist

device = accelerator.device
intermediate_value = {}
# One slot per process for the gathered objects.
output = [None] * accelerator.num_processes

# Some evaluation code.

# Collect each process's intermediate_value into output.
dist.all_gather_object(output, intermediate_value)

When I’m using multiple GPUs it’s fine, but when I’m using only one I get the following error:

RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.

What I’m wondering is: I thought that if you wrap your model, optimizer, etc. with HuggingFace Accelerate, you didn’t have to call torch.distributed.init_process_group yourself. If that’s the case, why doesn’t it work when I only have 1 GPU?

Thanks in advance.

This question seems to be HuggingFace-specific so you might want to post it in their discussion board.
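That said, the error itself suggests that no process group exists in the single-GPU run, so the collective has nothing to gather over. One common workaround (a sketch, not Accelerate's official API; the helper name `gather_all_objects` is made up for illustration) is to guard the collective and fall back to a trivial "gather" when only one process is running:

```python
import torch.distributed as dist

def gather_all_objects(obj, num_processes):
    # Hypothetical helper: only call the collective when a process
    # group actually exists. In a single-process run, the gathered
    # result is just a one-element list containing this process's object.
    if num_processes == 1 or not (dist.is_available() and dist.is_initialized()):
        return [obj]
    output = [None] * num_processes
    dist.all_gather_object(output, obj)
    return output

# Usage inside the evaluation loop would look like:
# output = gather_all_objects(intermediate_value, accelerator.num_processes)
```

This way the same evaluation loop runs unchanged on one GPU or many, and the multi-GPU branch behaves like the original `dist.all_gather_object` call.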
