How to load model weights that are stored as an OrderedDict?

Hello. I’m not sure if I’m just unfamiliar with saving and loading Torch models, but I’m facing this predicament and am not sure how to proceed.

I currently want to load someone else’s model and try to run it. I downloaded their .pt file that contains the model, and upon calling model = torch.load(PATH) I noticed that model is a dictionary with the keys model, opt, and optims.

I’ve never really seen that before, but I figured that the person saved the optimizer state as well as the model’s weights. I assigned model['model'] to another variable, but when I tried to run the model via an output = model2(input) call I got a TypeError: 'collections.OrderedDict' object is not callable and realized that the weights were stored as an OrderedDict.
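For reference, here is roughly what I’m doing (PATH stands for the actual path of the downloaded file):

import torch

checkpoint = torch.load(PATH)          # the downloaded .pt file
print(type(checkpoint))                # it's a plain dict, not a module
print(checkpoint.keys())               # dict_keys(['model', 'opt', 'optims'])

model2 = checkpoint['model']           # an OrderedDict of tensors, not a callable module
output = model2(input)                 # TypeError: 'collections.OrderedDict' object is not callable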

How might I proceed to achieve what I want? I don’t have much experience with loading and using pre-trained models, but something tells me it should be relatively straightforward.


The stored checkpoint most likely contains the state_dicts of the model and optimizer.
You would have to create a model instance first and load the state_dict afterwards as explained in the serialization docs:

model = MyModel()  # recreate the model architecture first
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # change to whatever optimizer was used

checkpoint = torch.load(path_to_checkpoint)
model.load_state_dict(checkpoint['model'])    # restore the model weights
optimizer.load_state_dict(checkpoint['opt'])  # restore the optimizer state

Note that you need the model definition to create the model instance.
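For context, a checkpoint with those keys is usually written roughly like this (the key names match your file; what exactly is stored under 'optims' is unknown, so this is only an assumption about how it was saved):

torch.save({
    'model': model.state_dict(),    # the weights, stored as an OrderedDict
    'opt': optimizer.state_dict(),  # the optimizer state
    # 'optims': ...                 # unknown extra entry in your file
}, path_to_checkpoint)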


Thanks @ptrblck. How do I get the model definition? I have an OrderedDict for a pre-trained LayoutLM model, but I don’t know the model definition. I have one config.json file and one args.bin file with me. I have just started learning PyTorch, so I don’t have much idea about it.

I’m not sure how the config.json and args.bin files are used, but you would need to use the model source code in order to create an instance of it before loading the state_dict.
How are you creating an object of the model at the moment?

I am trying to load a pre-trained LayoutLM model. I have downloaded the pre-trained model from https://drive.google.com/open?id=1Htp3vq8y2VRoTAwpHbwKM0lzZ2ByB8xM. It has only the state_dict and not the model itself.
I am using the classes given at the following link for the model definition: transformers.models.layoutlm.modeling_layoutlm — transformers 4.5.0.dev0 documentation.
Am I on the right path?

The latter link seems to point to the model definition, so you might create an object using:

model = LayoutLMModel(...)

and load the state_dict via:

model.load_state_dict(torch.load(PATH))
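If you are using the Hugging Face implementation from that link, a minimal sketch could look like this (using the default LayoutLMConfig is an assumption; adjust it, e.g. via LayoutLMConfig.from_json_file, to match the checkpoint you downloaded):

from transformers import LayoutLMConfig, LayoutLMModel
import torch

config = LayoutLMConfig()          # assumed default config; load your config.json instead if needed
model = LayoutLMModel(config)      # randomly initialized model with the LayoutLM architecture
state_dict = torch.load(PATH, map_location='cpu')
model.load_state_dict(state_dict)  # works if the keys in the file match the model's keys
model.eval()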

@ptrblck, following your suggestions, I have tried

model = LayoutlmModel.from_pretrained("microsoft/layoutlm-base-uncased")

then

model.load_state_dict(torch.load('pytorch_model.bin', map_location='cpu'))

The following error comes up:
RuntimeError: Error(s) in loading state_dict for LayoutlmModel:
Missing key(s) in state_dict: “embeddings.word_embeddings.weight”, “embeddings.position_embeddings.weight”, “embeddings.x_position_embeddings.weight”, “embeddings.y_position_embeddings.weight”, …

but the keys in the state_dict (pytorch_model.bin) and the keys in the model are the same, so why is this error shown?
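One way to see what actually differs would be to compare the two key sets directly (just a diagnostic sketch, not a fix):

loaded = torch.load('pytorch_model.bin', map_location='cpu')
model_keys = set(model.state_dict().keys())
file_keys = set(loaded.keys())
print(sorted(model_keys - file_keys))  # keys the model expects but the file does not provide
print(sorted(file_keys - model_keys))  # keys in the file the model does not expect (e.g. a different prefix)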

PS: model = LayoutlmModel.from_pretrained("microsoft/layoutlm-base-uncased", state_dict=torch.load('pytorch_model.bin', map_location='cpu')) has worked for me.
Thank you @ptrblck. Your suggestions were valuable.
