I’ve trained a model for text generation. I’m still fairly new to saving and loading models for inference, so for experience’s sake I coded two versions: one where I train the model, save it, and generate text directly, and another where I load the saved model and generate text from that.
The first version generates text as I wanted, but the text generated from the loaded model is just gibberish. I also noticed that the softmax outputs from the loaded model are all nearly equal, as if the weights were never trained.
This is how I save the model:
model_name = 'model.pth'
checkpoint = {'n_hidden': net.n_hidden,
              'n_layers': net.n_layers,
              'state_dict': net.state_dict(),
              'tokens': net.chars}
with open(model_name, 'wb') as f:
    torch.save(checkpoint, f)
And this is how it is loaded:
model_path = 'model.pth'
model = charRNN(chars, n_hidden, n_layers, lr, dropout)
checkpoint = torch.load(model_path)
model.load_state_dict(checkpoint['state_dict'])
model.eval()
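One thing worth checking: in the loading code above, the model is constructed from live variables (chars, n_hidden, n_layers) rather than from the values stored in the checkpoint, so the rebuilt architecture can silently differ from the trained one. Below is a minimal sketch, using a hypothetical CharRNN-like stand-in (not the actual class from my code), of rebuilding the model strictly from the checkpoint so the two can't drift apart:

```python
import torch
from torch import nn

# Hypothetical stand-in for the charRNN class in the question;
# only the attributes needed to demonstrate the save/load round trip.
class CharRNN(nn.Module):
    def __init__(self, tokens, n_hidden, n_layers):
        super().__init__()
        self.chars = tokens
        self.n_hidden = n_hidden
        self.n_layers = n_layers
        self.lstm = nn.LSTM(len(tokens), n_hidden, n_layers, batch_first=True)
        self.fc = nn.Linear(n_hidden, len(tokens))

# Train-side: save hyperparameters alongside the weights.
net = CharRNN(tokens=list('abc'), n_hidden=8, n_layers=2)
checkpoint = {'n_hidden': net.n_hidden,
              'n_layers': net.n_layers,
              'state_dict': net.state_dict(),
              'tokens': net.chars}
torch.save(checkpoint, 'model.pth')

# Load-side: rebuild the model from the checkpoint's own values,
# never from variables floating around the current session.
ckpt = torch.load('model.pth')
model = CharRNN(ckpt['tokens'], ckpt['n_hidden'], ckpt['n_layers'])
model.load_state_dict(ckpt['state_dict'])
model.eval()
```

With this pattern, a size mismatch between the saved weights and the rebuilt model raises an error in load_state_dict instead of silently producing a randomly initialized network.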