This is regarding saving and loading models that are wrapped with DataParallel. Let's say I wrap a model with DataParallel during training and save checkpoints at some interval. When I later load the saved checkpoints in another project that does not use DP, I get an error that the state_dict cannot be loaded because there is a mismatch between the saved and the existing state_dicts' keys. Apparently every saved key carries a `module.` prefix, which prevents the checkpoint from being used, and this is because DP wraps the model in an attribute named `module`. Moving the model to CPU before saving does not resolve this either; the state_dict still contains the `module.` prefix. How can I resolve this issue? Ideally I would love to have the model saved without the `module.` wrapper.
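
For reference, here is a minimal sketch that reproduces the mismatch (the `nn.Linear` toy model and the checkpoint path are placeholders, not my actual code):

```python
import torch
import torch.nn as nn

# Toy stand-in for my real network (hypothetical).
model = nn.Linear(10, 2)
dp_model = nn.DataParallel(model)

# Even after moving to CPU, the keys keep the "module." prefix.
dp_model = dp_model.cpu()
print(list(dp_model.state_dict().keys()))  # ['module.weight', 'module.bias']
torch.save(dp_model.state_dict(), "checkpoint.pth")

# In the other project, loading into a plain (non-DP) model fails:
plain_model = nn.Linear(10, 2)
state_dict = torch.load("checkpoint.pth")
try:
    plain_model.load_state_dict(state_dict)
except RuntimeError as e:
    # Missing key(s): "weight", "bias"; Unexpected key(s): "module.weight", "module.bias"
    print(e)
```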