Timeout when using dist.barrier() with the nccl backend

error report:
RuntimeError: CUDA error: the launch timed out and was terminated
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
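
The message suggests CUDA_LAUNCH_BLOCKING=1. A minimal sketch of one way to set it from Python, assuming it is applied in the parent process before the workers are spawned and before any CUDA call is made (the helper name `enable_cuda_launch_blocking` is made up for illustration):

```python
import os


def enable_cuda_launch_blocking(env=os.environ):
    """Set CUDA_LAUNCH_BLOCKING=1 in the given environment mapping.

    Assumption: this runs before the process makes its first CUDA call,
    so that kernel launches are synchronous and the stack trace points
    at the call that actually failed.
    """
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


if __name__ == "__main__":
    enable_cuda_launch_blocking()
    print(os.environ["CUDA_LAUNCH_BLOCKING"])
```

Child processes started afterwards inherit the parent's environment, so calling this once at the top of the launcher should be enough.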

brief
I found that dist.barrier() cannot wait for a long time when the **nccl** backend is used.

I tried to raise the timeout like this:
dist.init_process_group(backend, rank=rank, world_size=size, timeout=timedelta(seconds=2000))
but it has no effect with the nccl backend.
If dist.barrier() keeps a process waiting for longer than 7 seconds, the error above occurs.

code

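import multiprocessing
import os
import time
from datetime import timedelta
from multiprocessing import Process

import torch.distributed as dist


def run(rank, size):
    idx = 0
    while True:
        dist.barrier()
        if rank == 0:
            idx += 1
            # Simulated computation; the error occurs once this takes longer than 7 s.
            time.sleep(8)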

        print(f'sync:{idx}')
        dist.barrier()

def init_processes(rank, size, fn, backend='nccl'):
    """Initialize the distributed environment."""
    os.environ['MASTER_ADDR'] = '127.0.0.1'
    os.environ['MASTER_PORT'] = '29500'
    os.environ['NCCL_ASYNC_ERROR_HANDLING'] = '1'
    dist.init_process_group(backend, rank=rank, world_size=size, timeout=timedelta(seconds=2000))
    fn(rank, size)

if __name__ == '__main__':
    multiprocessing.set_start_method("spawn")
    size = 3
    processes = []
    for rank in range(size):
        p = Process(target=init_processes, args=(rank, size, run))
        p.start()
        processes.append(p)

    # Wait for all worker processes to finish.
    for p in processes:
        p.join()
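
A possible workaround, sketched under assumptions and not confirmed as a fix: since the nccl barrier is the one that raises the error, the long wait could go through a second process group on the gloo backend, created with dist.new_group and passed to dist.barrier(group=...). The helper name `run_with_cpu_barrier` and its parameters (`init_method`, `wait_s`, `steps`) are made up for illustration, and I have not verified that this avoids the CUDA launch timeout.

```python
import time
from datetime import timedelta

import torch.distributed as dist


def run_with_cpu_barrier(rank, world_size, init_method, backend="nccl",
                         wait_s=8.0, steps=1):
    """Hypothetical helper: synchronize through a separate gloo group.

    Returns the number of completed synchronization rounds.
    """
    dist.init_process_group(backend, init_method=init_method,
                            rank=rank, world_size=world_size,
                            timeout=timedelta(seconds=2000))
    # Every rank has to call new_group, even though only barriers use it.
    cpu_group = dist.new_group(backend="gloo",
                               timeout=timedelta(seconds=2000))
    completed = 0
    try:
        for _ in range(steps):
            dist.barrier(group=cpu_group)
            if rank == 0:
                time.sleep(wait_s)  # stand-in for the long computation
            dist.barrier(group=cpu_group)
            completed += 1
    finally:
        dist.destroy_process_group()
    return completed
```

Each rank would call it with the same `init_method` (for example `"tcp://127.0.0.1:29500"`), and nccl stays the default group for the actual tensor collectives.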