Is it necessary to set a random seed across multiple GPUs in DistributedDataParallel?

Following the ImageNet example (https://github.com/pytorch/examples/blob/master/imagenet/main.py), it seems that the seed is not set by default (the default is None):
parser.add_argument('--seed', default=None, type=int, help='seed for initializing training. ')
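
For context, when the flag is given, the script seeds the RNGs; the snippet below is a paraphrased sketch of that behavior (not an exact quote of main.py), kept self-contained with its own argument parsing:

```python
import argparse
import random
import warnings

import torch
import torch.backends.cudnn as cudnn

parser = argparse.ArgumentParser()
parser.add_argument('--seed', default=None, type=int,
                    help='seed for initializing training. ')
args = parser.parse_args()

# When --seed is provided, seed the Python and PyTorch RNGs and switch cuDNN
# to deterministic mode; when it is None (the default), nothing is seeded.
if args.seed is not None:
    random.seed(args.seed)
    torch.manual_seed(args.seed)
    cudnn.deterministic = True
    warnings.warn('Seeding training turns on the cuDNN deterministic setting, '
                  'which can slow down training considerably.')
```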

But when we use DistributedDataParallel mode, if the seed is not set, the initialized parameters will differ across GPUs, so different model parameters would be kept on different GPUs during training (although we only save the checkpoint on the rank-0 GPU).

I am not sure whether this would cause unknown errors or lead to unstable results. Is it safe not to set the initialization seed?

This should be fine, because DistributedDataParallel broadcasts the model's parameters and buffers from rank 0 to all other ranks at construction time, so every rank starts training from identical weights even if each process initialized them differently.
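
To convince yourself, here is a minimal sketch (not the DDP source itself) that you can launch on a single node, e.g. with `torchrun --nproc_per_node=2`. Each rank builds an unseeded model, wraps it in DDP, and gathers a parameter checksum from every rank: before wrapping the checksums differ, after wrapping they all match rank 0's.

```python
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    # Assumes a single-node launch, e.g. `torchrun --nproc_per_node=2 this_script.py`,
    # so the global rank can be used as the local CUDA device index.
    dist.init_process_group(backend="nccl")
    rank = dist.get_rank()
    world_size = dist.get_world_size()
    torch.cuda.set_device(rank)

    # No seed is set, so each process draws different initial weights.
    model = torch.nn.Linear(10, 10).cuda(rank)
    before = model.weight.detach().sum().reshape(1)

    # At construction time, DDP broadcasts rank 0's parameters and buffers
    # to all other ranks, overwriting each rank's local initialization.
    ddp_model = DDP(model, device_ids=[rank])
    after = ddp_model.module.weight.detach().sum().reshape(1)

    # Gather both checksums from every rank so rank 0 can print them side by side.
    befores = [torch.zeros_like(before) for _ in range(world_size)]
    afters = [torch.zeros_like(after) for _ in range(world_size)]
    dist.all_gather(befores, before)
    dist.all_gather(afters, after)
    if rank == 0:
        print("before DDP:", [t.item() for t in befores])  # values differ per rank
        print("after DDP: ", [t.item() for t in afters])   # values are identical

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

This only concerns the initial state; after that, DDP averages gradients across ranks at every step, so the replicas stay in sync throughout training regardless of whether a seed was set.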