Which function will be launched multiple times when I use the init_process_group function?

Hi,

I have 4 GPUs for training and launch it with:

```
CUDA_VISIBLE_DEVICES=0,1,2,3 WORLD_SIZE=4 python -m torch.distributed.launch --nproc_per_node=4 --master_port 49611 train.py
```
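
If I understand correctly, this command starts 4 separate copies of train.py (one per GPU), and each copy receives its own `--local_rank` argument from the launcher. This is a minimal sketch of how I assume that argument is picked up inside train.py (the argument name is my assumption about the launcher's default behavior):

```
import argparse

parser = argparse.ArgumentParser()
# assumption: torch.distributed.launch passes --local_rank=<0..3>
# to each of the 4 copies of train.py it starts
parser.add_argument('--local_rank', type=int, default=0)
args = parser.parse_args()

print('this copy of train.py got local_rank =', args.local_rank)
```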
I know that spawn launches the function passed to it once in each process.
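For comparison, this is roughly what I mean by spawn (a minimal, hypothetical example, not my actual training code):

```
import torch.multiprocessing as mp

def worker(rank, world_size):
    # this function runs once in each spawned process,
    # with rank taking the values 0 .. world_size-1
    print(f'running worker {rank} of {world_size}')

if __name__ == '__main__':
    mp.spawn(worker, args=(4,), nprocs=4)
```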
When I use the following code to initialize training, which function will be launched multiple times?
And when does local_rank get set to 0, 1, 2, or 3, respectively?

Is there any doc/link about how the init_process_group function initializes the processes?

```
import torch
import torch.distributed

def trainer():
    # ... rest of the setup code (args and _logger are defined elsewhere) ...
    if args.distributed:
        # each process pins itself to the GPU matching its local_rank
        torch.cuda.set_device(args.local_rank)
        device = torch.device('cuda:{}'.format(args.local_rank))
        torch.distributed.init_process_group(backend='nccl', init_method='env://')
        args.world_size = torch.distributed.get_world_size()
        args.rank = torch.distributed.get_rank()
    else:
        _logger.info('Training with a single process on 1 GPU.')
    assert args.rank >= 0
```
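
For completeness, my (possibly wrong) understanding is that init_method='env://' reads the rendezvous information that the launcher exported as environment variables before train.py starts, something like:

```
import os

# names I believe env:// looks up; MASTER_PORT should match --master_port above
for key in ('MASTER_ADDR', 'MASTER_PORT', 'RANK', 'WORLD_SIZE'):
    print(key, '=', os.environ.get(key))
```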