Changing the values of a model's parameters after wrapping it with DDP

In the DDP docs, there is a note warning that the user should never try to change the model's parameters:

You should never try to change your model's parameters after wrapping up your model with DistributedDataParallel. Because, when wrapping up your model with DistributedDataParallel, the constructor of DistributedDataParallel will register the additional gradient reduction functions on all the parameters of the model itself at the time of construction. If you change the model's parameters afterwards, gradient reduction functions no longer match the correct set of parameters.

I wonder whether it is allowed to change just the values of the model's parameters after wrapping the model with DDP.
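For concreteness, a minimal sketch of the situation I have in mind, assuming the process group has already been initialized (e.g. the script was launched via torchrun) with one process per GPU:

```python
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Assumption: dist.init_process_group() was called by the launcher setup.
local_rank = dist.get_rank() % torch.cuda.device_count()
model = torch.nn.Linear(10, 10).to(local_rank)
ddp_model = DDP(model, device_ids=[local_rank])

# The DDP constructor has now registered gradient-reduction hooks on every
# parameter of `model`. Is something like the following still safe?
with torch.no_grad():
    for param in ddp_model.parameters():
        param.fill_(0.0)  # change the values only, keep the Parameter objects
```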

Changing the values would create divergent parameter sets across the ranks unless you guarantee that all ranks apply exactly the same manipulation. I would still be careful to make sure you only change the values in place and never replace the parameter objects themselves.
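To illustrate the distinction, a sketch assuming `ddp_model` is the wrapped model from the snippet above (the clamping operation is just an arbitrary example):

```python
import torch

# Safe: in-place value change, applied identically on every rank.
# The Parameter objects, and the hooks DDP registered on them, are untouched.
with torch.no_grad():
    for param in ddp_model.parameters():
        param.clamp_(-0.5, 0.5)

# Unsafe: replacing a Parameter object. The new tensor was never seen by the
# DDP constructor, so its gradients would not be reduced across ranks:
# ddp_model.module.weight = torch.nn.Parameter(
#     torch.zeros_like(ddp_model.module.weight))
```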
