Safely removing a Module from DDP

mdlockyer · May 20, 2019, 11:27pm

Does anyone have any thoughts on the safest way to remove the DistributedDataParallel wrapper from a Module? Currently I’m just doing something like:

```python
# Model in this case has already been wrapped in DDP
model = model.module
```
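A slightly more defensive version of that one-liner, as a rough sketch: it only touches `.module` when the object really is a parallel wrapper, so it is safe to call on a model that was never wrapped. The helper name `unwrap_model` is hypothetical (not a PyTorch API), and this only guards the attribute access; it does not strip anything that was registered on the parameters.

```python
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel


def unwrap_model(model: nn.Module) -> nn.Module:
    """Return the wrapped module for DDP/DataParallel, else the model itself.

    `unwrap_model` is a hypothetical helper name, not part of PyTorch.
    """
    if isinstance(model, (DistributedDataParallel, nn.DataParallel)):
        return model.module
    return model


# A plain module passes through unchanged.
plain = nn.Linear(4, 2)
same = unwrap_model(plain)
```

Checking the type explicitly avoids surprises with models that happen to define their own `module` attribute.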

The DDP docs mention hooks that are registered on the module’s parameters:

> when wrapping up your model with DistributedDataParallel, the constructor of
> DistributedDataParallel will register the additional gradient
> reduction functions on all the parameters of the model itself at the
> time of construction

I take it those hooks are still there if I just grab the module attribute from the DDP instance, right?

mdlockyer · June 5, 2019, 6:57pm

Based on a recent issue opened for PyTorch, this is in fact the case: currently (v1.1.0) the module retains the reduction functions, and new ones are added each time the model is wrapped in DDP.
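One possible workaround on an affected version, sketched under the assumption that the leftover hooks live on the parameter objects themselves (as the docs quote describes): build a fresh instance of the model and copy the weights over via `state_dict()`, since a state dict carries tensors but not hooks. The helper `rebuild_without_hooks` and the `factory` argument are hypothetical names, and the hook registered below is only a stand-in for whatever DDP leaves behind, not DDP's actual reduction function.

```python
import torch
import torch.nn as nn


def rebuild_without_hooks(module: nn.Module, factory) -> nn.Module:
    """Build a new module via `factory()` and copy the weights into it.

    Hypothetical helper: the new parameters are new tensor objects, so hooks
    registered on the old parameters do not come along.
    """
    fresh = factory()
    fresh.load_state_dict(module.state_dict())
    return fresh


calls = []  # records every time the stand-in hook fires

old = nn.Linear(4, 2)
# Stand-in for a leftover gradient hook (assumption, not DDP's real hook).
old.weight.register_hook(lambda grad: calls.append("hook fired"))

new = rebuild_without_hooks(old, lambda: nn.Linear(4, 2))

# Backward through the rebuilt module: the stand-in hook should not fire.
new(torch.randn(3, 4)).sum().backward()
hook_calls_on_new = len(calls)

# Backward through the original module: the hook is still attached.
old(torch.randn(3, 4)).sum().backward()
hook_calls_on_old = len(calls)
```

This costs an extra copy of the weights, so it is mainly useful when the model needs to be re-wrapped or handed off after distributed training.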

pietern · June 24, 2019, 6:01am

This has been fixed. The fix will ship in PyTorch 1.2 and is already available in the nightly builds.