Saving and reloading model: save and reload vocab as well?

When I train a text generation model and then reload it in a separate session, I get abysmal performance. I figured out that the problem is likely the vocabulary changing: every time I start a session I build a new vocabulary from the training data, which is generated by a random split of a DataFrame, so the "stoi" mapping is different in each session.

Is there a way to save the vocabulary along with the model parameters to make sure that inference will be successful?

If you are using a random split, I suggest you look here:

I think it will let you get the same random split every time.
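For example, if the split is done with `torch.utils.data.random_split`, passing a seeded `torch.Generator` makes it reproducible across sessions. A minimal sketch with a toy stand-in dataset (the real DataFrame-backed dataset is assumed):

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Toy dataset standing in for the DataFrame-backed one (assumption).
data = TensorDataset(torch.arange(10))

# A seeded generator pins the split, independent of global RNG state.
g = torch.Generator().manual_seed(42)
train, val = random_split(data, [8, 2], generator=g)

# Re-seeding with the same value in a "new session" gives the same split.
g2 = torch.Generator().manual_seed(42)
train2, val2 = random_split(data, [8, 2], generator=g2)
```

Note that this only keeps the train/validation partition stable; if the vocabulary is built by iterating the split in a different order, "stoi" can still drift.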

Thanks for replying. I am using the following snippet to set the random seed at the beginning of every session:

    import os, random, torch

    random.seed(seed)
    torch.manual_seed(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    os.environ['PYTHONHASHSEED'] = str(seed)

Nevertheless, the issue remains.
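A more robust fix is to persist the vocabulary itself alongside the model parameters, so inference never has to rebuild it from data. A minimal sketch, assuming the vocabulary is a plain `stoi` dict and using a stand-in model (names and file path here are hypothetical):

```python
import torch

stoi = {"<unk>": 0, "hello": 1, "world": 2}  # hypothetical vocab
model = torch.nn.Linear(4, len(stoi))        # stand-in for the real model

# Save weights and vocab together in a single checkpoint file.
torch.save({"state_dict": model.state_dict(), "stoi": stoi}, "checkpoint.pt")

# In a new session: restore both; never rebuild the vocab from the data.
ckpt = torch.load("checkpoint.pt")
loaded_stoi = ckpt["stoi"]
model = torch.nn.Linear(4, len(loaded_stoi))
model.load_state_dict(ckpt["state_dict"])
```

Since the checkpoint carries its own `stoi`, the random split (and any seeding) no longer affects inference: the indices the model was trained with are exactly the ones used at load time.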