LSTM Learning, parameter size

Since you only posted the model but not the training code, it’s difficult to say what’s going on. Here are a few pointers that strike me as odd:

  • Are you using the whole dataset as a single batch? Training is usually done in mini-batches, which typically improves convergence and memory usage; see the DataLoader sketch after this list.

  • According to your comments, num_classes = 2, so self.fc(h_out) should return a shape of (batch_size, 2), in your case (50656, 2). Your error message reports something different, though.

  • I haven’t checked it in detail, but h_out.view(-1, self.hidden_size) is probably wrong, at least once you increase num_layers. You generally cannot just force a shape of (something, hidden_size) so that it works with self.fc; see the second sketch after this list. Have a look at this post.
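To illustrate the first point, here is a minimal mini-batch setup with a DataLoader. The sequence length, input size, and batch size are placeholders, since your actual data wasn’t posted; only the sample count (50656) and num_classes = 2 come from your description:

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Dummy data with placeholder shapes: (num_samples, seq_len, input_size)
X = torch.randn(50656, 28, 10)
y = torch.randint(0, 2, (50656,))  # binary labels for num_classes = 2

loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

for batch_x, batch_y in loader:
    # batch_x: (64, 28, 10), batch_y: (64,) for full batches --
    # the training step runs on one mini-batch at a time
    ...
```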
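Regarding the last two points: instead of reshaping h_out with view, you can index the hidden state of the last layer, which already has the shape self.fc expects. A minimal sketch with placeholder hyperparameters, assuming a unidirectional LSTM with batch_first=True:

```python
import torch
import torch.nn as nn

num_layers, hidden_size, num_classes = 2, 32, 2
lstm = nn.LSTM(input_size=10, hidden_size=hidden_size,
               num_layers=num_layers, batch_first=True)
fc = nn.Linear(hidden_size, num_classes)

x = torch.randn(64, 28, 10)   # (batch, seq_len, input_size)
out, (h_n, c_n) = lstm(x)     # h_n: (num_layers, batch, hidden_size)

# h_n[-1] selects the last layer's hidden state: (batch, hidden_size).
# A view(-1, hidden_size) would instead flatten the layer dimension
# into the batch dimension whenever num_layers > 1.
logits = fc(h_n[-1])
print(logits.shape)           # torch.Size([64, 2]) -> (batch_size, num_classes)
```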
