I’m confused about how to set the seed in torch DDP. I know that the models must be initialized to the same parameter values across processes, so the seed must be the same.
You're right that, absent any synchronization, models on each rank would be initialized with different values if the seeds differ.
However, at construction time DDP broadcasts the model parameters (and buffers) from rank 0 to all other ranks, so every rank starts training with identical parameters regardless of how each process was seeded. Setting a seed is only needed if you want determinism across different training runs with regard to the initial model params.
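To make that broadcast step concrete, here is a plain-Python sketch (no torch; the function and variable names are made up for illustration) of what conceptually happens when DDP is constructed:

```python
import random

def init_params(seed, n=4):
    """Simulate per-rank parameter initialization with an unsynced seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

# Ranks seeded differently -> different initial parameters.
params_per_rank = {rank: init_params(seed=rank) for rank in range(3)}
assert params_per_rank[0] != params_per_rank[1]

# The "broadcast" step: overwrite every rank's params with rank 0's copy,
# analogous to what DDP does from rank 0 at startup.
for rank in params_per_rank:
    params_per_rank[rank] = list(params_per_rank[0])

# After the broadcast, all ranks hold identical parameters.
assert all(p == params_per_rank[0] for p in params_per_rank.values())
```

In real code this broadcast happens inside the `torch.nn.parallel.DistributedDataParallel(model)` constructor, so you don't need to call anything extra yourself.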