Hello! I’m running into an issue with my code for loading checkpoints. Within my wrapped Lightning module I have a single nn model which is instantiated with a model config and a tokenizer config. The model config is a .json file specifying various model hyperparameters, and the tokenizer config is a Python file that similarly defines the tokenizer characteristics.
Within PyTorch Lightning (PL) I use the common way of saving checkpoints:
from typing import Optional, Union

class ModelTrainer(pl.LightningModule):
    def __init__(
        self,
        train_config: Union[dict, str],
        tokenizer_config: Optional[str] = None,
        model_config: Optional[str] = None,
    ):
        super().__init__()
        self.save_hyperparameters()
        self.model: MyModel = MyModel(tokenizer_config, model_config)
        # (rest of the code to load the model)
The problem arises when I try to instantiate it:
model: ModelTrainer = ModelTrainer.load_from_checkpoint(checkpoint)
It gives me an error from the __init__ method of the model class itself (MyModel), at the point where it tries to actually load the JSON:
with open(model_config, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '.../model_configs/model_medium.json'
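As far as I can tell, this happens because save_hyperparameters() records the original __init__ arguments (including the old config paths) inside the checkpoint under a "hyper_parameters" key, and load_from_checkpoint replays them into __init__, which re-opens the now-missing JSON file. A tiny self-contained sketch of what I mean (the dict below is a hand-made stand-in for a real checkpoint, not actual output from my run):

```python
# Hypothetical illustration: a Lightning checkpoint is essentially a dict.
# save_hyperparameters() stores the original __init__ arguments under
# "hyper_parameters", and load_from_checkpoint replays them into __init__.
fake_ckpt = {
    "state_dict": {"model.some_layer.weight": "…tensor…"},
    "hyper_parameters": {
        "train_config": "train_config.json",          # made-up path
        "tokenizer_config": "tokenizer_config.py",    # made-up path
        "model_config": "model_configs/model_medium.json",
    },
}

# load_from_checkpoint effectively does:
#   ModelTrainer(**fake_ckpt["hyper_parameters"])
# which re-runs MyModel(tokenizer_config, model_config) and tries to
# open(model_config) again -- hence the FileNotFoundError.
print(fake_ckpt["hyper_parameters"]["model_config"])
```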
Prior to using PL I was using lucidrains' pytorch custom utils for checkpoint saving/loading (specifically the init_and_load functions), which somehow avoided this issue altogether and is able to load the weights without attempting to read any of the JSON files.
But with the PL LightningModule I have to use Lightning's checkpointing framework. I guess my question here is: how can I instantiate the model from the trained weights while bypassing the need for the configs altogether?
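For context, the manual fallback I've been considering looks roughly like this: read the checkpoint dict directly and strip Lightning's "model." prefix from the state_dict keys so the weights can be loaded into the bare MyModel, skipping ModelTrainer entirely. This is only a sketch under my assumptions (in practice ckpt would come from torch.load("path.ckpt", map_location="cpu"); a tiny stand-in dict is used here so the snippet is self-contained, and the key names are made up):

```python
# Hedged sketch: bypass ModelTrainer.__init__ by loading the raw state_dict.
# Lightning prefixes the inner module's keys with the attribute name
# ("model."), so those prefixes must be stripped before load_state_dict.
ckpt = {
    "state_dict": {
        "model.embed.weight": "…tensor…",  # placeholder values
        "model.head.bias": "…tensor…",
    }
}

state_dict = {
    k.removeprefix("model."): v
    for k, v in ckpt["state_dict"].items()
    if k.startswith("model.")
}
# my_model.load_state_dict(state_dict)  # assumes MyModel built some other way
print(sorted(state_dict))  # ['embed.weight', 'head.bias']
```

The catch, of course, is that this still needs a constructed MyModel to call load_state_dict on, which is exactly what requires the configs, so I'd love to hear a cleaner way.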
Many thanks in advance!