Running a model from a loaded state_dict without the model's code definition

Hi there,

Say I download a trained model (e.g. http://ml.cs.tsinghua.edu.cn/~chenxi/pytorch-models/mnist-b07bb66b.pth; clicking the link starts the download).

It contains the state_dict of the model. The question is: how do I run the model without having access to its code definition, such as the one here: https://github.com/aaron-xichen/pytorch-playground/blob/master/mnist/model.py? Is it possible?

The generalized question might sound like “How to run a trained model if you have its state_dict only?”.

Thanks for the help!

You won’t be able to run the model, as the state_dict only contains its parameters and buffers.
In PyTorch “eager” mode the model definition is its Python source code, so that code needs to be accessible.
If the definition is available at export time, you can script the model and store the scripted model, which would be executable without the model definition.

Do you mean saving and loading the entire model like here https://pytorch.org/tutorials/recipes/recipes/saving_and_loading_models_for_inference.html#save-and-load-entire-model?

Is it practically possible to parse the state_dict to reconstruct the code for the model? … and then just load the parameters from state_dict.

No, I wouldn’t use this approach as you would have to make sure all source files are in the expected locations and your loading might easily break.
The introduction section explains it a bit better.

I was referring to torch.jit.save.
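
A minimal sketch of that workflow, assuming the model definition is still available on the exporting side. `TinyNet` is a hypothetical stand-in, not the MNIST model linked above, and an in-memory buffer is used instead of a file only to keep the example self-contained:

```python
import io

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in model; not the MNIST model from this thread."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)
        self.fc2 = nn.Linear(8, 2)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))


# Exporting side: the class definition is still available here.
model = TinyNet().eval()
scripted = torch.jit.script(model)

buffer = io.BytesIO()  # a file path such as "tiny_net.pt" would work as well
torch.jit.save(scripted, buffer)

# Loading side: only the serialized archive is needed, not the TinyNet class.
buffer.seek(0)
loaded = torch.jit.load(buffer)

x = torch.randn(3, 4)
with torch.no_grad():
    same_output = torch.allclose(model(x), loaded(x))
```

The saved archive carries both the parameters and the scripted forward logic, which is what a plain state_dict lacks. It does not help with a checkpoint that was already saved as a state_dict only.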

If you have an idea of what the model looks like, e.g. that all layers were called sequentially in the order their parameters are stored, you might be able to hack together a model definition. Since the forward method is unknown, I don’t think it’s a practical approach.
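
To show what such a hack could look like, here is a rough sketch. The checkpoint below is a hypothetical stand-in with flat key names, and the guess that every 2-D weight belongs to an `nn.Linear` followed by a ReLU is purely an assumption; nothing in a state_dict confirms the activation functions or the call order:

```python
from collections import OrderedDict

import torch
import torch.nn as nn

# Hypothetical checkpoint standing in for a downloaded state_dict.
state_dict = OrderedDict(
    [
        ("fc1.weight", torch.randn(8, 4)),
        ("fc1.bias", torch.randn(8)),
        ("fc2.weight", torch.randn(2, 8)),
        ("fc2.bias", torch.randn(2)),
    ]
)

# Parameter names and shapes are all the checkpoint reveals.
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))

# Assumption: every 2-D "*.weight" is an nn.Linear, applied in key order,
# with a ReLU between layers. Key names are assumed to contain a single dot.
linear_names = [
    key[: -len(".weight")]
    for key, value in state_dict.items()
    if key.endswith(".weight") and value.dim() == 2
]

layers = OrderedDict()
for i, name in enumerate(linear_names):
    out_features, in_features = state_dict[name + ".weight"].shape
    layers[name] = nn.Linear(in_features, out_features)
    if i < len(linear_names) - 1:
        layers[name + "_relu"] = nn.ReLU()  # guessed, not stored anywhere

guessed = nn.Sequential(layers)
guessed.load_state_dict(state_dict)  # strict by default: keys and shapes must match
output = guessed(torch.randn(3, 4))
```

Loading succeeds whenever the names and shapes line up, so a successful `load_state_dict` call says nothing about whether the guessed forward pass matches the original model.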

Can you please clarify this?

By “eager” I mean the normal mode without using the JIT.
