I know this problem has been addressed many times, but I cannot find any answers, so I’m trying again. I’m building an LSTM classifier to predict a class based on a text. The issue is that my validation accuracy stagnates around 35%.
I’m wondering if it’s my model or my data preparation that isn’t working. Does my model look correct to you, or am I missing something? Thanks!
Pro tip: You don’t have to initialize the hidden state to 0s in LSTMs. PyTorch does that automatically.
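A minimal sketch of what I mean (the sizes here are made up): if you call an nn.LSTM without passing (h_0, c_0), PyTorch fills both with zeros, so explicit initialization is redundant.

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)
x = torch.randn(8, 10, 32)  # (batch, seq_len, input_size)

# Omitting (h_0, c_0): PyTorch initializes both to zeros.
out, (h_n, c_n) = lstm(x)

# Explicit zero initialization gives the same result.
h_0 = torch.zeros(1, 8, 64)  # (num_layers, batch, hidden_size)
c_0 = torch.zeros(1, 8, 64)
out_explicit, _ = lstm(x, (h_0, c_0))
assert torch.allclose(out, out_explicit)
```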
That network looks fine imo. Maybe try changing the embedding size, the number of stacked layers, and input_size.
You’re passing the hidden state from the last RNN output. Instead, you could use the output value from the last time step, or average the hidden/output values over all time steps.
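A quick sketch of both options (assuming batch_first=True and made-up sizes):

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)
x = torch.randn(8, 10, 32)       # (batch, seq_len, input_size)
out, (h_n, c_n) = lstm(x)        # out: (batch, seq_len, hidden_size)

last_step = out[:, -1, :]        # option 1: output at the last time step
mean_pooled = out.mean(dim=1)    # option 2: average over all time steps

# Either (batch, hidden_size) tensor can be fed to the final linear layer.
```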
From a quick look at your code:
The shape of x after self.emb(x) should be (batch_size, seq_len, embed_dim).
So batch_size = x.size(1) gives you the wrong value.
In your code you do NOT use batch_first=True, but the batch is in fact the first dimension of x (this is probably why the network throws no error despite the wrong batch_size). A corrected sketch follows.
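Since the full model isn’t posted, here is a hypothetical version with batch_first=True and the batch size read from dimension 0 (all layer sizes and names are assumptions):

```python
import torch
import torch.nn as nn

class Classifier(nn.Module):
    def __init__(self, vocab_size=1000, embed_dim=32, hidden_dim=64, num_classes=5):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, embed_dim)
        # batch_first=True: the LSTM expects (batch, seq_len, embed_dim),
        # which matches what self.emb produces.
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):               # x: (batch_size, seq_len)
        batch_size = x.size(0)          # batch is dimension 0 here, not 1
        x = self.emb(x)                 # (batch_size, seq_len, embed_dim)
        out, _ = self.lstm(x)           # (batch_size, seq_len, hidden_dim)
        return self.fc(out[:, -1, :])   # logits from the last time step

logits = Classifier()(torch.randint(0, 1000, (8, 10)))  # (8, num_classes)
```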
Thanks for the tips and the recommendations. My training accuracy also stops improving: after around 20 epochs I get the following performance, and it never goes beyond this:
Epoch 0 | Train loss 1.587 | Train acc 0.263 | Valid loss 1.575 | Valid acc 0.280
Epoch 27 | Train loss 1.073 | Train acc 0.553 | Valid loss 1.596 | Valid acc 0.357
I tried changing the hyperparameters, but nothing helps.
@vdw thanks for the solutions. I have made the changes, but performance stays the same. I’ll dig into it; maybe my data is in the wrong shape.