Running a model from a loaded state_dict without model's code definition

satbek · September 24, 2020, 5:04am

Hi there,

Say I download a trained model (e.g. http://ml.cs.tsinghua.edu.cn/~chenxi/pytorch-models/mnist-b07bb66b.pth, which starts downloading if you click it).

It contains the state_dict of the model. The question is: how do I run the model without having access to its code definition, such as the one here https://github.com/aaron-xichen/pytorch-playground/blob/master/mnist/model.py? Is it possible?

The generalized question might sound like “How to run a trained model if you have its state_dict only?”.

Thanks for the help!

ptrblck · September 24, 2020, 8:06am

You won’t be able to run the model, as the state_dict only contains its parameters and buffers.
In PyTorch “eager” mode the model definition is its Python source code, which therefore needs to be accessible.
Alternatively, the model can be scripted and the scripted model stored; that stored model is executable without the model definition.
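
As a rough illustration of the difference, here is a minimal sketch. TinyNet is a hypothetical stand-in (it is not the MNIST model linked above), and an in-memory buffer is used in place of a file. It shows that the state_dict is only a mapping from names to tensors, while a scripted model saved with torch.jit.save can be loaded and run without the class definition.

```python
import io

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in model; not the MNIST model from the thread."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)
        self.fc2 = nn.Linear(8, 2)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))


model = TinyNet().eval()

# A state_dict maps parameter/buffer names to tensors; it holds no forward logic.
state_dict = model.state_dict()
state_dict_keys = list(state_dict.keys())

# Scripting captures the forward logic together with the parameters.
scripted = torch.jit.script(model)
buffer = io.BytesIO()  # stands in for a file on disk
torch.jit.save(scripted, buffer)
buffer.seek(0)

# torch.jit.load does not need the TinyNet class to be importable.
loaded = torch.jit.load(buffer)

x = torch.randn(3, 4)
with torch.no_grad():
    same_output = torch.allclose(model(x), loaded(x))
```

Note that scripting has to be done by someone who still has the model definition; it does not help if only the state_dict file is available.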

satbek · September 24, 2020, 8:08am

Do you mean saving and loading the entire model like here https://pytorch.org/tutorials/recipes/recipes/saving_and_loading_models_for_inference.html#save-and-load-entire-model?

satbek · September 24, 2020, 8:10am

Is it practically possible to parse the state_dict to reconstruct the code for the model? … and then just load the parameters from state_dict.

ptrblck · September 24, 2020, 8:11am

No, I wouldn’t use that approach, as you would have to make sure all source files are in the expected locations, and loading might easily break.
The introduction section explains it a bit better.

I was referring to torch.jit.save.

ptrblck · September 24, 2020, 8:12am

If you have an idea of what the model looks like, e.g. that all layers were called sequentially, you might be able to hack together a model definition. Since the forward method is unknown, though, I don’t think it’s a practical approach.
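
To make the "hack" concrete, here is a sketch under strong assumptions: every 2-D "weight" entry is taken to be an nn.Linear layer, the layers are assumed to be called in state_dict order, and ReLU is assumed between them. The helper guess_mlp and the key names in fake_state_dict are hypothetical and are not taken from the linked checkpoint. None of these assumptions can be verified from the state_dict alone, which is exactly why the approach is fragile.

```python
from collections import OrderedDict

import torch
import torch.nn as nn


def guess_mlp(state_dict):
    """Guess a Linear/ReLU stack from the 2-D weights, in state_dict order.

    The activation function and the call order are assumptions; the real
    forward method may do something entirely different.
    """
    layers = []
    for name, weight in state_dict.items():
        if not name.endswith(".weight") or weight.dim() != 2:
            continue  # skip biases and anything that is not a Linear weight
        out_features, in_features = weight.shape
        bias = state_dict.get(name[: -len("weight")] + "bias")
        linear = nn.Linear(in_features, out_features, bias=bias is not None)
        with torch.no_grad():
            linear.weight.copy_(weight)
            if bias is not None:
                linear.bias.copy_(bias)
        layers += [linear, nn.ReLU()]
    return nn.Sequential(*layers[:-1])  # drop the trailing ReLU


# Hypothetical state_dict standing in for a downloaded checkpoint.
fake_state_dict = OrderedDict(
    [
        ("fc1.weight", torch.randn(8, 4)),
        ("fc1.bias", torch.randn(8)),
        ("fc2.weight", torch.randn(2, 8)),
        ("fc2.bias", torch.randn(2)),
    ]
)

guessed = guess_mlp(fake_state_dict).eval()
with torch.no_grad():
    output_shape = tuple(guessed(torch.randn(3, 4)).shape)
```

Even when the shapes line up and the model runs, the outputs are only meaningful if the guessed activations and layer order match the original forward method.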

satbek · September 24, 2020, 8:14am

Can you please clarify this?

ptrblck · September 24, 2020, 8:16am

By “eager” I mean the normal mode without using the JIT.
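
A small sketch of the distinction, using a hypothetical toy module (Doubler): the eager module is an ordinary Python object whose forward runs in the interpreter, while torch.jit.script produces a ScriptModule that carries its own TorchScript representation of forward.

```python
import torch
import torch.nn as nn


class Doubler(nn.Module):
    """Hypothetical toy module used only for illustration."""

    def forward(self, x):
        return x * 2


# Eager: a plain Python object; forward is executed by the Python interpreter.
eager = Doubler()

# JIT: the same module compiled to TorchScript.
scripted = torch.jit.script(eager)

eager_is_script = isinstance(eager, torch.jit.ScriptModule)
scripted_is_script = isinstance(scripted, torch.jit.ScriptModule)

# The scripted module stores a TorchScript rendering of forward.
torchscript_source = scripted.code
```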