I trained a char-LSTM model for text generation. I saved the trained model with
torch.save() and loaded it with
torch.load(). But when I use the loaded model for prediction, I get the following error:
AttributeError: 'LSTM' object has no attribute '_flat_weights'
My model is defined as:
import torch
import torch.nn as nn

class charGen(nn.Module):
    def __init__(self, n_letters, lstm_size, lstm_layers=3,
                 lstm_dropout=0, dropout=0, hidden_dim=128):
        super(charGen, self).__init__()
        self.n_letters = n_letters
        self.lstm_size = lstm_size
        self.lstm_layers = lstm_layers
        self.lstm_dropout = lstm_dropout
        self.dropout = dropout  # note: overwritten by the nn.Dropout module below
        self.hidden_dim = hidden_dim
        self.lstm = nn.LSTM(n_letters, lstm_size, num_layers=lstm_layers,
                            batch_first=False, dropout=lstm_dropout)
        self.dropout = nn.Dropout(p=dropout)
        self.relu = nn.ReLU()
        self.fc = nn.Sequential(
            nn.Linear(self.lstm_size, self.hidden_dim),
            self.relu, self.dropout,
            nn.Linear(self.hidden_dim, self.hidden_dim),
            self.relu, self.dropout,
            nn.Linear(self.hidden_dim, self.hidden_dim),
            self.relu, self.dropout,
            nn.Linear(self.hidden_dim, self.n_letters)
        )

    def forward(self, x, prev_states):
        out, state = self.lstm(x, prev_states)
        fc_in = out.view(-1, out.size(2))
        fc_out = self.fc(fc_in)
        return fc_out, state

    def zero_state(self, batch_size):
        return [torch.zeros(self.lstm_layers, batch_size, self.lstm_size),
                torch.zeros(self.lstm_layers, batch_size, self.lstm_size)]
I can't understand what's going wrong here. Any help would be appreciated.
EDIT: I run the training and model-saving code on a different machine (my university's server) with PyTorch version '1.0.1', while the loading and inference code runs on my own system with PyTorch version '1.4.0'. Could that be the reason?
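To illustrate what I suspect is happening, here is a plain-pickle sketch (not my actual code, and no PyTorch involved): torch.save() pickles the whole module object by default, and unpickling restores the saved __dict__ without calling __init__, so any internal attribute that a newer version of the class creates in __init__ never appears on the loaded object.

```python
import io
import pickle

# "Old" version of a class, standing in for the PyTorch 1.0.1 LSTM.
class Model:
    def __init__(self):
        self.weights = [1, 2, 3]

buf = io.BytesIO()
pickle.dump(Model(), buf)  # like torch.save(model, path)

# Simulate upgrading the library: the "new" class (PyTorch 1.4.0)
# gains an internal attribute that only __init__ creates.
class Model:
    def __init__(self):
        self.weights = [1, 2, 3]
        self._flat_weights = list(self.weights)

    def forward(self):
        return self._flat_weights  # relies on the new attribute

buf.seek(0)
loaded = pickle.load(buf)  # restores the OLD __dict__ onto the NEW class
try:
    loaded.forward()
except AttributeError as e:
    print(e)  # 'Model' object has no attribute '_flat_weights'
```

If this is indeed the cause, saving self-describing state (e.g. the model's state_dict) rather than the pickled object would presumably avoid the version coupling.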