When I train a text generation model and then reload it in a separate session, I get abysmal performance. I suspect the problem is the vocabulary changing: every time I start a session I build a new vocabulary from the training data, which comes from a random split of a DataFrame, so the `stoi` mapping is different every time.
Is there a way to save the vocabulary along with the model parameters to make sure that inference will be successful?
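This is roughly what I have in mind — a minimal sketch where I stand in for the real vocab object with a plain `stoi` dict and persist it with `json` (in my real code the vocab comes from the training tokens and the weights from `model.state_dict()`; the names here are hypothetical):

```python
import json

def build_vocab(tokens):
    # Sorting makes stoi deterministic for the same token set, but the
    # token set itself still depends on the random train split.
    return {tok: i for i, tok in enumerate(sorted(set(tokens)))}

stoi = build_vocab(["the", "cat", "sat", "the"])

# Save the vocabulary next to the model checkpoint at training time...
with open("vocab.json", "w") as f:
    json.dump(stoi, f)

# ...then in a later inference session, load it back instead of rebuilding it.
with open("vocab.json") as f:
    reloaded = json.load(f)

assert reloaded == stoi  # same token -> index mapping as at training time
```

Is saving it as a separate file like this the right approach, or can it be bundled into the same checkpoint file as the model parameters?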