How to change the DDP parameter find_unused_parameters from True to False during training

Hi, I have a question about DDP.

When I use DDP while dynamically freezing and unfreezing my model’s parameters during training, one way to avoid errors is to set find_unused_parameters=True. But when no parameters are actually unused during backpropagation, the following warning appears:

[W reducer.cpp:1050] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())
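For context, my setup looks roughly like this (a minimal sketch; the model and the freezing schedule here are just placeholders):

```python
import torch
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

# assumes torch.distributed.init_process_group(...) has already been called
model = nn.Sequential(nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 1)).cuda()

# find_unused_parameters=True so DDP tolerates parameters that receive
# no gradient while they are frozen
ddp_model = DDP(model, device_ids=[torch.cuda.current_device()],
                find_unused_parameters=True)

# later during training: freeze a submodule dynamically
for p in ddp_model.module[0].parameters():
    p.requires_grad = False  # these parameters become "unused" in backward
```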

Is there any way to disable find_unused_parameters during the process?

P.S.) Using pytorch_lightning is not an option for me right now :frowning:


I’m not sure if you can completely disable it, but based on the usage of TORCH_WARN_ONCE as seen here, I would expect to see this warning only once. Or are you seeing it in each iteration?

Only once when I unfreeze the rest of the parameters.

By the way, does that also make performance (i.e. accuracy) worse?
I thought only a speed issue would occur, but the metric value drops significantly and then comes up again repeatedly. (Or maybe my model is just not stable :frowning: )

No, the accuracy should not change, only the speed, since additional checks are needed.

Thanks for the reply. Right now I use add_param_group in the optimizer to unfreeze parameters.
I hope find_unused_parameters can be updated dynamically in a future PyTorch release.
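A minimal sketch of that pattern (the model and the unfreeze point are placeholders, not my actual code):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 1))

# start with the first layer frozen and optimize only the remaining parameters
for p in model[0].parameters():
    p.requires_grad = False
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.1)

# later: unfreeze the first layer and register it with the optimizer
for p in model[0].parameters():
    p.requires_grad = True
optimizer.add_param_group({"params": model[0].parameters(), "lr": 0.1})
```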


This is very annoying; it should be possible to disable this.
What is the workaround for using DDP while only updating a subset of parameters?

@dlsf Thanks for the feedback! Is your use case that you would like to train a few iterations/epochs with find_unused_parameters=True, then switch to find_unused_parameters=False for later iterations? If so, one option is to just create a new DDP instance with a different find_unused_parameters value for the remaining iterations.
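A rough sketch of that idea (assuming the process group is already initialized; the phase boundaries are arbitrary):

```python
import torch
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

# assumes torch.distributed.init_process_group(...) has already been called
local_rank = torch.cuda.current_device()
model = nn.Linear(10, 1).cuda()

# phase 1: some parameters may be unused, so detection is enabled
ddp_model = DDP(model, device_ids=[local_rank], find_unused_parameters=True)
# ... train the first few epochs ...

# phase 2: all parameters are now used, so re-wrap the underlying module
# without the extra autograd-graph traversal
ddp_model = DDP(ddp_model.module, device_ids=[local_rank],
                find_unused_parameters=False)
# ... continue training ...
```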

My use case is to train a subset of the whole model. The model contains multiple modules, and it updates different modules' parameters depending on the input data. So how can I use DDP in this case? find_unused_parameters=False will result in an error.
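The structure is roughly like this (a minimal sketch; the branch modules and routing condition are placeholders):

```python
import torch.nn as nn

class MultiBranchModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.branch_a = nn.Linear(10, 1)
        self.branch_b = nn.Linear(10, 1)

    def forward(self, x, use_a: bool):
        # only one branch participates in the autograd graph per batch,
        # so the other branch's parameters receive no gradient
        return self.branch_a(x) if use_a else self.branch_b(x)
```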

Hi ptrblck,

Thank you so much for your answer. I found that setting find_unused_parameters=True significantly lowers the training speed. Is this true in general?

Thanks!

A slowdown is expected, and you might want to check whether static_graph would work instead, as it could potentially reduce the slowdown. From the docs:

  1. Potentially improve performance when there are unused parameters, as DDP will not search graph in each iteration to detect unused parameters when static_graph is set to be True. To check whether you can set static_graph to be True, one way is to check ddp logging data at the end of your previous model training, if ddp_logging_data.get("can_set_static_graph") == True, mostly you can set static_graph = True as well.
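A minimal sketch of trying static_graph (the _get_ddp_logging_data() call is an internal helper, so treat its availability as an assumption and check it against your PyTorch version):

```python
import torch
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

# assumes torch.distributed.init_process_group(...) has already been called
device = torch.cuda.current_device()
model = nn.Linear(10, 1).cuda()

# previous run: train with unused-parameter detection enabled
ddp_model = DDP(model, device_ids=[device], find_unused_parameters=True)
# ... train ...

# at the end of that run, check whether the graph stayed static
ddp_logging_data = ddp_model._get_ddp_logging_data()  # internal helper
if ddp_logging_data.get("can_set_static_graph") == True:
    # next run: let DDP skip the per-iteration unused-parameter search
    ddp_model = DDP(model, device_ids=[device], static_graph=True)
```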