Does anyone have any thoughts on the safest way to remove the DistributedDataParallel wrapper from a Module? Currently I’m just doing something like:
# Model in this case has already been wrapped in DDP
model = model.module
The DDP docs mention hooks that get registered on the module’s parameters:
when wrapping up your model with DistributedDataParallel, the constructor of
DistributedDataParallel will register the additional gradient
reduction functions on all the parameters of the model itself at the
time of construction
I take it those hooks are still there if I just grab the module attribute from the DDP instance, right?
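For concreteness, here is a minimal sketch of the two options I'm weighing. The names `unwrap_ddp`, `fresh_copy`, and `factory` are hypothetical helpers of my own, not part of PyTorch. The first helper is just the `model.module` approach with a type check. The second sidesteps the hook question rather than answering it: it loads the weights into a new instance that was never passed to the DDP constructor, on the assumption that you can rebuild the same architecture.

```python
import torch
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def unwrap_ddp(model: nn.Module) -> nn.Module:
    """Return the underlying module if `model` is DDP-wrapped, else `model` itself.

    Note: this returns the very same module object that was handed to DDP,
    so anything DDP attached to its parameters is not undone here.
    """
    return model.module if isinstance(model, DDP) else model


def fresh_copy(model: nn.Module, factory) -> nn.Module:
    """Build a never-wrapped module and copy the (possibly wrapped) weights into it.

    `factory` is a hypothetical zero-argument callable that constructs a module
    with the same architecture as the wrapped one.
    """
    clean = factory()
    # state_dict() taken from the inner module has no "module." key prefix.
    clean.load_state_dict(unwrap_ddp(model).state_dict())
    return clean
```

Usage would look something like `clean = fresh_copy(ddp_model, lambda: MyModel(cfg))`, where `MyModel` and `cfg` stand in for whatever builds the original network. The cost is a second copy of the weights in memory.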