RuntimeError: RPC has not been initialized. Call torch.distributed.rpc.init_rpc first

I am running a FedAvg simulation using PyTorch RPC, but when I run it, the server side throws this error. I suspect the problem is in my code, but I can't tell what it is. Here are the related code snippets:

.
.
.
    #Start training
    if args.rank == 0:
        for e in range(args.epoch):
            processes = []
            q = mp.Queue()
            print("Server's Epoch:"+str(e+1))
            weight = copy.deepcopy(model.state_dict())
            for r in range(args.world_size):
                p = mp.Process(
                    target=run_worker,
                    args=(
                        r,
                        model,
                        args.lr,
                        train_loader[r],
                        device,
                        args.epoch,
                        weight,
                        q))
                processes.append(p)
                p.start()

            for p in processes:
                p.join()
.
.
.

And for the function run_worker:

def run_worker(rank, model, lr, train_loader, device, epoch, weight, q):
    out_weight = rpc.rpc_sync(f"Worker{rank}", train, args=(rank, model, lr, train_loader, device, epoch, weight))
    q.put([rank, out_weight]) 

What is my main problem?

Error logs

Server initialized!
Server's Epoch:1
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/pi/FYP/FedAvg_RPC.py", line 96, in run_worker
    out_weight = rpc.rpc_sync(f"Worker{rank}", train, args=(rank, model, lr, train_loader, device, epoch, weight))
  File "/usr/local/lib/python3.9/dist-packages/torch/distributed/rpc/api.py", line 75, in wrapper
    raise RuntimeError(
RuntimeError: RPC has not been initialized. Call torch.distributed.rpc.init_rpc first.
Process Process-2:
Traceback (most recent call last):
  File "/usr/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/pi/FYP/FedAvg_RPC.py", line 96, in run_worker
    out_weight = rpc.rpc_sync(f"Worker{rank}", train, args=(rank, model, lr, train_loader, device, epoch, weight))
  File "/usr/local/lib/python3.9/dist-packages/torch/distributed/rpc/api.py", line 75, in wrapper
    raise RuntimeError(
RuntimeError: RPC has not been initialized. Call torch.distributed.rpc.init_rpc first.
Process Process-3:
Traceback (most recent call last):
  File "/usr/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/pi/FYP/FedAvg_RPC.py", line 96, in run_worker
    out_weight = rpc.rpc_sync(f"Worker{rank}", train, args=(rank, model, lr, train_loader, device, epoch, weight))
  File "/usr/local/lib/python3.9/dist-packages/torch/distributed/rpc/api.py", line 75, in wrapper
    raise RuntimeError(
RuntimeError: RPC has not been initialized. Call torch.distributed.rpc.init_rpc first.

Minified repro

No response

Versions

Collecting environment information...
PyTorch version: 1.8.0a0+37c1f4a
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: Debian GNU/Linux 11 (bullseye) (aarch64)
GCC version: (Debian 10.2.1-6) 10.2.1 20210110
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31

Python version: 3.9.2 (default, Feb 28 2021, 17:03:44) [GCC 10.2.1 20210110] (64-bit runtime)
Python platform: Linux-5.15.84-v8+-aarch64-with-glibc2.31
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] mypy==0.812
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.19.5
[pip3] torch==1.8.0a0+37c1f4a
[pip3] torchvision==0.9.0a0+01dfa8e
[conda] Could not collect

Replied in the issue too. Can you initialize the RPC package first by calling torch.distributed.rpc.init_rpc?

Fixed after posting: I was shutting down RPC on the server too early.
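
For anyone who lands here with the same error, here is a minimal sketch of the call ordering discussed in this thread: init_rpc first, then every RPC call, and shutdown only after all results are back. It is not the original FedAvg_RPC.py. It assumes a single process with world_size=1 that sends calls to itself, uses torch.add as a stand-in for the real train function, and issues the calls with rpc_async from the initialized process instead of from separate mp.Process children. The names run_round and _free_port are hypothetical helpers made up for this example.

```python
import os
import socket

import torch
import torch.distributed.rpc as rpc


def _free_port():
    # Hypothetical helper: ask the OS for an unused TCP port for the rendezvous.
    with socket.socket() as s:
        s.bind(("localhost", 0))
        return s.getsockname()[1]


def run_round(num_calls=3):
    # Rendezvous settings read by init_rpc's default env:// init method.
    os.environ["MASTER_ADDR"] = "localhost"
    os.environ["MASTER_PORT"] = str(_free_port())

    # 1. Initialize RPC in the same process that will issue the calls.
    rpc.init_rpc("Worker0", rank=0, world_size=1)
    try:
        # 2. Issue every call and collect every result while RPC is still up.
        #    torch.add is only a stand-in for the real train function.
        futures = [
            rpc.rpc_async("Worker0", torch.add, args=(torch.zeros(2), r))
            for r in range(num_calls)
        ]
        results = [fut.wait() for fut in futures]
    finally:
        # 3. Shut down last; calling this before step 2 finishes is the
        #    "shutdown too early" mistake described above.
        rpc.shutdown()
    return results
```

Calling run_round() should return one tensor per call. In a real multi-node setup each participant would call init_rpc with its own name and rank, and the server would keep rpc.shutdown() at the very end of its training loop.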