Apart from all your questions, I don’t see that you initialize the hidden state of your LSTM layer in each iteration; see this post. There should be something like:
```python
for i, inputs in enumerate(X_train):
    # Initialize the hidden state before each batch
    model.hidden = model.init_hidden(batch_size)
    ...
```
and your model class having an `init_hidden` method like:
```python
def init_hidden(self, batch_size):
    # Shape: (num_layers * num_directions, batch_size, hidden_dim)
    return (torch.zeros(self.num_layers * self.directions_count, batch_size, self.rnn_hidden_dim).to(self.device),
            torch.zeros(self.num_layers * self.directions_count, batch_size, self.rnn_hidden_dim).to(self.device))
```
I just copied this from my code, so you would need to adapt it to your requirements.
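For context, here is a minimal self-contained sketch of how the two pieces fit together. The model class, layer sizes, and parameter names are hypothetical placeholders, not taken from your code:

```python
import torch
import torch.nn as nn

# Hypothetical minimal model illustrating per-batch hidden-state
# initialization for an LSTM.
class LSTMModel(nn.Module):
    def __init__(self, input_dim, rnn_hidden_dim, num_layers=1,
                 bidirectional=False, device="cpu"):
        super().__init__()
        self.rnn_hidden_dim = rnn_hidden_dim
        self.num_layers = num_layers
        self.directions_count = 2 if bidirectional else 1
        self.device = device
        self.lstm = nn.LSTM(input_dim, rnn_hidden_dim, num_layers,
                            batch_first=True, bidirectional=bidirectional)

    def init_hidden(self, batch_size):
        # Fresh (h_0, c_0), both zeros, shaped
        # (num_layers * num_directions, batch_size, hidden_dim)
        shape = (self.num_layers * self.directions_count,
                 batch_size, self.rnn_hidden_dim)
        return (torch.zeros(*shape).to(self.device),
                torch.zeros(*shape).to(self.device))

    def forward(self, x, hidden):
        out, hidden = self.lstm(x, hidden)
        return out, hidden

model = LSTMModel(input_dim=8, rnn_hidden_dim=16)
batch = torch.randn(4, 10, 8)              # (batch, seq_len, input_dim)
hidden = model.init_hidden(batch_size=4)   # re-initialized for each batch
out, hidden = model(batch, hidden)
```

Without this re-initialization, the hidden state (and its computation graph) carries over between batches, which is usually not what you want when samples are independent.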