Storing large numbers of similar models

Hello everyone,

I’m reposting this issue as I’ve submitted it to the wrong forum previously.

While doing some research using PyTorch over the last couple of months, I have been facing a conceptual problem related to storing variations of models, to which I cannot find a satisfying answer (excuse me if I missed an obvious resource, but my searches have led to nothing).

If I create a model and store its parameters, I can recreate the model by loading the parameters into the same model class. However, if I slightly vary the model to investigate new possibilities, this changes the source for the previous network, meaning I am now obliged to store the source code (99% identical) twice, once for each variation, if I want to test or compare the models later. When this is repeated for dozens or hundreds of variations, the code redundancy becomes problematic due to the sheer number of versions of similar networks. Additionally, if a network's weights are serialized but the model's source is not saved and is subsequently changed, retrieving the model will require some guessing at the original architecture.
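To make the mismatch concrete, here is a minimal sketch (the class names `NetV1`/`NetV2` and the file name are hypothetical): saving a `state_dict` from one version of a model and loading it into a slightly changed version fails, because the keys no longer match the module structure.

```python
import torch
import torch.nn as nn

# Original model: a tiny network (illustrative architecture).
class NetV1(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 5)

# A slight variation of the same model: one extra layer.
class NetV2(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 5)
        self.fc2 = nn.Linear(5, 2)

# Serialize the weights of the original version.
torch.save(NetV1().state_dict(), "net_v1.pt")

# Loading those weights into the changed class raises a RuntimeError,
# because the state_dict keys no longer match the module structure.
try:
    NetV2().load_state_dict(torch.load("net_v1.pt"))
except RuntimeError as e:
    print("load failed:", e)
```

This is why, without the matching source code, the serialized weights alone are not enough to reconstruct the model.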

If I understand correctly, the protocol buffers available in TensorFlow help ease this problem, but no similar technique is available in PyTorch.

I would like to ask whether anyone uses a workaround for this issue, or whether I am missing something obvious and it is not really an issue. Also, what would be a correct way to tackle it, in case no solution is widely used yet?

Thank you very much,

Is the code redundancy a large problem for you? If you're iterating over models, I would generate a source file for each model, run the model, and then save the weights to a file that you can quickly associate (by name) with the source file. A good organization scheme (folders, file naming) can help in managing the redundancy.

Having a copy of the source file for each model means that loading can go off without a hitch, and you can access the right parts of each particular model correctly.
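The naming scheme above could be sketched like this (the helper name `save_run` and the `runs/<name>/` folder layout are just one possible convention, not a PyTorch API):

```python
import shutil
from pathlib import Path

import torch

def save_run(model, source_file, run_name, out_dir="runs"):
    """Save a model's weights and a copy of its defining source file
    under one run name, so they can always be matched up later.

    `run_name` and the directory layout are an illustrative convention.
    """
    run_dir = Path(out_dir) / run_name
    run_dir.mkdir(parents=True, exist_ok=True)
    # Serialize only the parameters, as usual in PyTorch.
    torch.save(model.state_dict(), run_dir / "weights.pt")
    # Keep the exact source that defined this model next to the weights.
    shutil.copy(source_file, run_dir / "model.py")
    return run_dir
```

Each run folder is then self-describing: the weights and the code that produced them never drift apart.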

I don’t know enough about the protocol buffers in TensorFlow to understand how they alleviate this problem: do they store source code (or at least a representation of what the serialized weights are) along with the serialized model?

An alternative approach might be to write a model generator that generates a model based on a string description. The string description could be stored with the weights.
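A minimal sketch of such a generator, assuming a made-up string format like `"Linear:10-20,ReLU,Linear:20-2"` (the format and the function name are hypothetical, for illustration only):

```python
import torch.nn as nn

def build_from_spec(spec):
    """Build an nn.Sequential from a comma-separated layer description.

    Hypothetical format: "Linear:IN-OUT" for linear layers, "ReLU" for
    activations, e.g. "Linear:10-20,ReLU,Linear:20-2".
    """
    layers = []
    for token in spec.split(","):
        if token == "ReLU":
            layers.append(nn.ReLU())
        elif token.startswith("Linear:"):
            in_f, out_f = token.split(":")[1].split("-")
            layers.append(nn.Linear(int(in_f), int(out_f)))
        else:
            raise ValueError(f"unknown layer spec: {token}")
    return nn.Sequential(*layers)
```

The spec string can then be stored in the same checkpoint as the weights, e.g. `torch.save({"spec": spec, "state_dict": model.state_dict()}, path)`, so the architecture is always recoverable. The obvious limitation is that the format only covers architectures you taught the generator to parse.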

Thank you very much for both the replies;

The code redundancy is annoying because some properties must be changed across all models while others are model-dependent. However, in the end I am iterating over different models, and only a select few need to be stored long-term, so both of your suggestions may solve my problem. I will try storing each serialized weight table alongside the corresponding source code, and adapt the evaluation script to automatically load the corresponding model from the same folder.
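The loading side of that scheme could look like this, assuming each run folder contains a `model.py` defining the model class and a `weights.pt` with its parameters (the helper name `load_run` and the default class name `Net` are hypothetical):

```python
import importlib.util
from pathlib import Path

import torch

def load_run(run_dir, class_name="Net"):
    """Import the model class from the saved source file in `run_dir`,
    instantiate it, and load the matching weights.

    Assumed folder layout: run_dir/model.py and run_dir/weights.pt.
    """
    run_dir = Path(run_dir)
    # Import the copied source file as a throwaway module.
    spec = importlib.util.spec_from_file_location("saved_model", run_dir / "model.py")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    # Rebuild the model from exactly the code that produced the weights.
    model = getattr(module, class_name)()
    model.load_state_dict(torch.load(run_dir / "weights.pt"))
    return model
```

Because the weights are always loaded through the copy of the source that produced them, later changes to the working code can no longer break old checkpoints.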

Generating a model from a string description was my first attempt, but I ran into problems with implementation flexibility for less standard models (for example, non-sequential architectures). I’m sure it is achievable with some dedication, but I wondered whether any standard solutions were available, hence this post.

Again, thank you very much for the help,


If anyone is facing the same problem: I ended up using some spare time to develop a tool that solves this exact problem.

You can find it, modify it and use it at will here:

It is still at an early phase of development, so all suggestions/bug reports are appreciated. Please use it cautiously for now, as it is not thoroughly tested.