Hello everyone,
I have a model that does character-level text generation.
So far, I have been saving and loading the trained model like this:
torch.save(model, 'model')
model = torch.load('model')
model.eval()
and it has been fine.
Now I am trying to do the same thing using the state_dict because it is more convenient, but the text generated during inference looks completely random, without any meaning.
The way I save and load the state dict is:
torch.save(model.state_dict(), 'model_rnn.pt')
model = RNN(tokens=chars, hidden_size=hidden_size, n_layers=n_layers, dropout_prob=dropout_prob).to(device)
model.load_state_dict(torch.load('model_rnn.pt', map_location='cpu'))
model.eval()
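To rule out the save/load step itself, a round-trip check along these lines might help. This is only a minimal sketch: `TinyRNN` is a hypothetical stand-in for my actual `RNN` class, `roundtrip_matches` is an illustrative helper name, and the sizes are made up. It saves the state_dict to an in-memory buffer instead of a file, loads it into a fresh instance, and compares the outputs of both models on the same input.

```python
import io

import torch
import torch.nn as nn


class TinyRNN(nn.Module):
    """Hypothetical stand-in for the RNN class used in the post."""

    def __init__(self, n_tokens, hidden_size, n_layers, dropout_prob):
        super().__init__()
        self.lstm = nn.LSTM(n_tokens, hidden_size, n_layers,
                            dropout=dropout_prob, batch_first=True)
        self.dropout = nn.Dropout(dropout_prob)
        self.fc = nn.Linear(hidden_size, n_tokens)

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.fc(self.dropout(out))


def roundtrip_matches(model, make_model, sample):
    """Save the state_dict, load it into a fresh model, compare outputs."""
    buffer = io.BytesIO()
    torch.save(model.state_dict(), buffer)
    buffer.seek(0)

    restored = make_model()
    restored.load_state_dict(torch.load(buffer, map_location='cpu'))

    # Both models must be in eval mode so dropout does not add noise.
    model.eval()
    restored.eval()
    with torch.no_grad():
        return torch.allclose(model(sample), restored(sample))


torch.manual_seed(0)
# Assumed sizes, chosen only for illustration.
make_model = lambda: TinyRNN(n_tokens=10, hidden_size=16,
                             n_layers=2, dropout_prob=0.5)
original = make_model()
sample = torch.randn(1, 5, 10)  # (batch, sequence, features)
same_outputs = roundtrip_matches(original, make_model, sample)
```

If a check like this passes for the real model, the weights are presumably being restored correctly, and the difference would have to come from something outside the state_dict.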
I know that setting the model to evaluation mode is important because dropout shouldn’t be applied during inference, and I am doing that.
Here is an example of the text when I save and load the whole model:
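As a small illustration of why eval mode matters, here is a sketch using a standalone `nn.Dropout` layer (not my model; the tensor size and `p=0.5` are arbitrary). In training mode the layer zeroes some elements and rescales the rest, while in eval mode it passes the input through unchanged.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

# Training mode: elements are randomly zeroed, survivors scaled by 1 / (1 - p).
drop.train()
train_out = drop(x)

# Eval mode: dropout is a no-op, so the output equals the input.
drop.eval()
eval_out = drop(x)

zeros_in_train = int((train_out == 0).sum())
unchanged_in_eval = torch.equal(eval_out, x)
```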
Ти ми стои верност се појаваш
сам си ми се проклето сега
и пак да стави светот на порти
and here is an example with state_dict:
о2Е’лС’кокоЊХсьоTьiпоМьВьоџоЊьоЏХ’к’TP2СДоСьоЊџоЊzп́’z’PоTџz’оT’Э=ДЦДP2K’лоШДzьоTьоШХ’лџокьzџШльС2Сџд
I know most of you can’t read Cyrillic, but it is still easy to see the difference.
Does anyone know what might be the problem?