How to load a model on a single GPU that was trained with DataParallel?

ptrblck · July 15, 2020, 10:13am

It’s recommended to store the model.module.state_dict() for data parallel models as explained here.
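If you’ve stored the model.state_dict() of the wrapped model instead, every key will start with the module. prefix (e.g. module.conv1.weight for a layer named conv1), which you would have to remove before loading into the plain model, as described here.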
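
A minimal sketch of the key renaming, in case it helps. It deliberately uses a plain dict as a stand-in for the checkpoint so it runs without PyTorch; the helper name strip_module_prefix and the layer name fc are made up for illustration, and real values would be tensors rather than integers.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading `prefix` removed from each key.

    Keys that do not start with `prefix` are copied unchanged.
    """
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Stand-in for a checkpoint saved from the DataParallel wrapper itself
# (hypothetical layer name `fc`; real values would be tensors).
wrapped = OrderedDict([("module.fc.weight", 1), ("module.fc.bias", 2)])

cleaned = strip_module_prefix(wrapped)
print(list(cleaned.keys()))  # ['fc.weight', 'fc.bias']
```

With a real checkpoint you would pass the result of torch.load(path, map_location=...) through the helper and hand the output to model.load_state_dict on the unwrapped model. Saving model.module.state_dict() in the first place avoids this step entirely.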