I’m reposting this issue as I’ve submitted it to the wrong forum previously.
While doing research with PyTorch over the last couple of months I’ve been facing a conceptual problem related to storing variations of models, to which I cannot find a satisfying answer (excuse me if I missed an obvious resource, but my search has led to nothing).
If I create a model and store its parameters, I can recreate the model by loading the parameters into the same model. However, if I slightly vary the model to investigate new possibilities, this changes the source code behind the previous network, meaning I am now obliged to store the source (99% identical) twice, once per variation, if I want to test/compare the models later. Repeated over dozens or hundreds of variations, the code redundancy becomes problematic due to the sheer number of versions of similar networks. Additionally, if the weights of a network are serialized but the model’s source is not saved and is subsequently changed, retrieving the model requires some guessing at the original architecture.
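To make the situation concrete, here is a minimal sketch (the architecture and filenames are just for illustration) of the standard state_dict workflow, which only works while the class definition stays unchanged:

```python
import torch
import torch.nn as nn

# A hypothetical model; the exact architecture is only for illustration.
class Net(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.fc1 = nn.Linear(10, hidden)
        self.fc2 = nn.Linear(hidden, 2)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

model = Net()
torch.save(model.state_dict(), "net.pt")

# Restoring works only while the class definition still matches:
restored = Net()
restored.load_state_dict(torch.load("net.pt"))

# If Net is later edited (a layer renamed, a size changed), the call
# above raises a key or shape mismatch instead of restoring the model.
```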
If I understand correctly, the protocol buffers available in TensorFlow help ease this problem, but no similar technique is available in PyTorch.
I would like to ask whether anyone uses a workaround for this issue, or whether I am missing something obvious and it’s not really an issue. Also, what would be a correct way to tackle it, in the case that no solution is widely used yet?
Is the code redundancy a large problem for you? If you’re iterating over models, I would generate a source file for each model, run the model, and then save the weights into a file that you can quickly associate (by name) with the source file. A good organization scheme (folders, file naming) can help in managing the redundancy.
Having a copy of the source file for each model means that loading can go off without a hitch and you can correctly access the right parts of each particular model.
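Something like the following (the `runs/<name>/` layout and the `save_run` helper are my own invented convention, not a standard API) keeps each checkpoint next to the exact source that produced it:

```python
import shutil
from pathlib import Path

def save_run(weights_path, source_file, run_name, out_dir="runs"):
    """Copy a checkpoint and the exact source that produced it into one folder.

    The layout (runs/<run_name>/weights.pt + model.py) is an invented
    convention, not anything standard.
    """
    run_dir = Path(out_dir) / run_name
    run_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy(weights_path, run_dir / "weights.pt")
    shutil.copy(source_file, run_dir / "model.py")
    return run_dir
```

After each experiment you would call e.g. `save_run("net.pt", "model.py", "variant_007")`, and the run name is the only key you need to match weights to code later.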
I don’t know enough about how the protocol buffers in TensorFlow work to understand how they alleviate this problem: do they store source code (or at least a representation of what the serialized weights are) along with the serialized model?
An alternative approach might be to write a model generator that builds a model from a string description. The string description could then be stored with the weights.
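As a rough sketch of that idea (the spec format below, "linear:in-out" and "relu" tokens, is invented for this example, not a standard), the description string can live in the same checkpoint dict as the weights:

```python
import torch.nn as nn

def build_from_spec(spec):
    """Build an nn.Sequential from a compact string description.

    The format ("linear:in-out" and "relu", comma-separated) is an
    invented example; extend it with whatever layer types you need.
    """
    layers = []
    for token in spec.split(","):
        if token == "relu":
            layers.append(nn.ReLU())
        elif token.startswith("linear:"):
            in_f, out_f = (int(n) for n in token.split(":")[1].split("-"))
            layers.append(nn.Linear(in_f, out_f))
        else:
            raise ValueError(f"unknown layer token: {token!r}")
    return nn.Sequential(*layers)

# The spec travels with the weights in a single checkpoint dict:
model = build_from_spec("linear:10-64,relu,linear:64-2")
checkpoint = {"spec": "linear:10-64,relu,linear:64-2",
              "state_dict": model.state_dict()}
```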
The code redundancy is annoying because some properties must be changed across all models while others are model-dependent. However, in the end I am iterating over different models and only a select few need to be stored long-term, so both of your suggestions may solve my problem. I will try to resolve it by storing each serialized weight table alongside the corresponding source code, and adapting the evaluation script to automatically load the matching model from the same folder.
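The loading side of that plan can be sketched with `importlib` (the folder layout, and the assumption that each stored `model.py` defines a class named `Net` with a no-argument constructor, are my own conventions for this example):

```python
import importlib.util
from pathlib import Path

import torch

def load_run(run_dir):
    """Rebuild a model from a folder holding model.py and weights.pt.

    Assumes (an invented convention) that model.py defines a class
    named Net with a no-argument constructor.
    """
    run_dir = Path(run_dir)
    spec = importlib.util.spec_from_file_location("run_model", run_dir / "model.py")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    model = module.Net()
    model.load_state_dict(torch.load(run_dir / "weights.pt"))
    return model
```

The evaluation script then only needs the folder name; the source stored next to the weights guarantees the architecture always matches.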
Generating a model from a string description was my first attempt, but I ran into problems with implementation flexibility for less standard models (for example, ones that weren’t linear). I’m sure it is achievable with some dedication, but I wondered whether any standard solutions were available, hence this post.