I use PyTorch Lightning for saving checkpoints. Unfortunately, when I try to load a model from a checkpoint I get a size mismatch in the weight tensors. This happens because the model.load_from_checkpoint() method instantiates a new model object but fails to infer the correct hyperparameters, so the layers have the wrong size when the weights are loaded.
I was wondering whether (and how) it is possible to load the weights saved in the checkpoint into a manually pre-instantiated model?
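To illustrate, here is a minimal sketch of what I have in mind. `MyModel` is a hypothetical stand-in for my actual model class, and I'm relying on the fact that a Lightning checkpoint is a plain dict saved with torch.save whose "state_dict" key holds the model weights:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the real LightningModule.
class MyModel(nn.Module):
    def __init__(self, hidden_size=16):
        super().__init__()
        self.layer = nn.Linear(8, hidden_size)

    def forward(self, x):
        return self.layer(x)

# Simulate a Lightning checkpoint: a dict with the weights under "state_dict".
trained = MyModel(hidden_size=32)
torch.save({"state_dict": trained.state_dict()}, "example.ckpt")

# Instantiate the model manually with the correct hyperparameters,
# then load only the weights from the checkpoint.
model = MyModel(hidden_size=32)
checkpoint = torch.load("example.ckpt", map_location="cpu")
model.load_state_dict(checkpoint["state_dict"])
```

Is something along these lines the intended way to do it, or is there a cleaner Lightning-native mechanism?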
Because I use different models within the same script based on a CLI argument, passing the hyperparameters to the load_from_checkpoint() method would lead to either a bunch of duplicated code or a substantial amount of refactoring I'd like to avoid.
I'm grateful for any suggestions. If you need further information, please ask.