Partial state load (models)

Right now, the model structure must match the saved state exactly when using load_state_dict().
It would be useful to introduce an optional argument, say allow_missing_keys, so that the function doesn’t throw when some of the model’s keys are missing from the saved state.
The use case is models that get extended: since training takes a long time, it’s useful (and sometimes necessary for convergence) to train a subset of the model, then add new, randomly initialized components and resume training from there (either of the whole model or only of the additional components).
Right now, to achieve this it’s necessary to call state_dict() on the extended model, merge the saved state into it, and then call load_state_dict() with the result. This is neither particularly elegant nor memory-friendly for big models.
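
For reference, here is a minimal sketch of that merge workaround. The `Base` and `Extended` classes, their layer names, and the sizes are made up for illustration, and `saved_state` stands in for a checkpoint that would normally be read from disk.

```python
import torch
import torch.nn as nn


# Hypothetical models, for illustration only.
class Base(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(4, 8)


class Extended(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(4, 8)  # same name and shape as in Base
        self.head = nn.Linear(8, 2)     # new, randomly initialized component


# Stands in for the state saved after training the smaller model.
saved_state = Base().state_dict()

extended = Extended()
merged = extended.state_dict()    # every key of the extended model
merged.update(saved_state)        # overwrite the entries found in the saved state
extended.load_state_dict(merged)  # the key sets now match exactly
```

After this, `encoder` holds the saved weights and `head` keeps its random initialization. The merged dict is the extra full-size structure the post complains about.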


You can still do so.

Create a new model in which one subpart has the same structure as the old model. As long as that subpart’s structure matches the old model, you can load the old state dict into that subpart alone. Easily doable.
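
A rough sketch of what I mean, assuming the old model can be nested as a submodule of the new one. The class names, layer names, and sizes here are hypothetical, and `old_state` stands in for the saved file.

```python
import torch
import torch.nn as nn


# Hypothetical old model.
class Base(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 8)

    def forward(self, x):
        return self.fc(x)


# Hypothetical extended model that wraps the old one as a submodule.
class Extended(nn.Module):
    def __init__(self):
        super().__init__()
        self.base = Base()            # same structure as the old model
        self.extra = nn.Linear(8, 2)  # new, randomly initialized component

    def forward(self, x):
        return self.extra(torch.relu(self.base(x)))


# Stands in for the state dict saved from the trained old model.
old_state = Base().state_dict()

model = Extended()
# Load only into the submodule: its keys ("fc.weight", "fc.bias") match the old model.
model.base.load_state_dict(old_state)

out = model(torch.zeros(1, 4))
```

This only works when the old model sits inside the new one as a unit. If the new components are interleaved with the old layers, the keys no longer line up and you are back to merging dicts.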