LSTM Learning, parameter size

Since you only posted the model but not the training code, it’s difficult to say what’s going on. Here are a few pointers that strike me as odd:

  • Are you using the whole dataset as a single batch? Training is usually done in mini-batches, which typically improves convergence and memory usage; see the DataLoader sketch after this list.

  • According to your comments, num_classes = 2, so self.fc(h_out) should return a shape of (batch_size, 2), in your case (50656, 2). Your error message reports something different, though.

  • I haven’t checked it in detail, but h_out.view(-1, self.hidden_size) is probably wrong, at least once you increase num_layers. You generally cannot just force a shape of (something, hidden_size) so that it works with self.fc; see the second sketch after this list. Have a look at this post.
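To illustrate the first point, here is a minimal mini-batch setup with a DataLoader. The sequence length, input size, and batch size are placeholders, since your actual data wasn’t posted; only the sample count (50656) and num_classes = 2 come from your description:

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Dummy data with placeholder shapes: (num_samples, seq_len, input_size)
X = torch.randn(50656, 28, 10)
y = torch.randint(0, 2, (50656,))  # binary labels for num_classes = 2

loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

for batch_x, batch_y in loader:
    # batch_x: (64, 28, 10), batch_y: (64,) for full batches --
    # the training step runs on one mini-batch at a time
    ...
```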
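Regarding the last two points: instead of reshaping h_out with view, you can index the hidden state of the last layer, which already has the shape self.fc expects. A minimal sketch with placeholder hyperparameters, assuming a unidirectional LSTM with batch_first=True:

```python
import torch
import torch.nn as nn

num_layers, hidden_size, num_classes = 2, 32, 2
lstm = nn.LSTM(input_size=10, hidden_size=hidden_size,
               num_layers=num_layers, batch_first=True)
fc = nn.Linear(hidden_size, num_classes)

x = torch.randn(64, 28, 10)   # (batch, seq_len, input_size)
out, (h_n, c_n) = lstm(x)     # h_n: (num_layers, batch, hidden_size)

# h_n[-1] selects the last layer's hidden state: (batch, hidden_size).
# A view(-1, hidden_size) would instead flatten the layer dimension
# into the batch dimension whenever num_layers > 1.
logits = fc(h_n[-1])
print(logits.shape)           # torch.Size([64, 2]) -> (batch_size, num_classes)
```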
