Code Review - Sentiment Analysis [Beginner]

vdw · April 28, 2021, 1:43am

Since you say you’re a beginner, I would start with a simpler model. You currently use the hidden states from all time steps. While you can do this in principle, you might run into problems when using packing, since the LSTM then updates the hidden states of each sequence only up to that sequence’s actual length.

For example, if the longest sequence in your batch has length 20 and the batch also contains a sequence S of length 15, the last 5 hidden states of S won’t be meaningful, because the LSTM stopped for S at time step 15. You still feed all 20 time steps into the next linear layer.
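To make this concrete, here is a small sketch of the 20 vs. 15 example. The layer sizes and the random input are made up for illustration and are not taken from your code; it only assumes a plain nn.LSTM with batch_first=True.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
input_size, hidden_size = 8, 4
lstm = nn.LSTM(input_size, hidden_size, batch_first=True)

# Two sequences padded to length 20; the second one (S) really has 15 steps.
batch = torch.randn(2, 20, input_size)
lengths = torch.tensor([20, 15])

packed = pack_padded_sequence(batch, lengths, batch_first=True)
packed_out, (h_n, c_n) = lstm(packed)

# Unpacking pads every sequence back to the longest length (20) with zeros.
output, out_lengths = pad_packed_sequence(packed_out, batch_first=True)

print(output.shape)                   # torch.Size([2, 20, 4])
print((output[1, 15:] == 0).all())    # steps 15..19 of S are only padding
```

If you flatten all 20 steps of S and pass them to a linear layer, a quarter of its input is padding that says nothing about the sentence.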

The safer bet – particularly in the beginning – is to use hidden after

output, (hidden, cell) = self.lstm_cell(packedSeq, hidden)

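hidden contains the last hidden state of each sequence, i.e., the state at its final real time step, not at the padded end. (For nn.LSTM this is h_n, the hidden-state part of the (h_n, c_n) tuple the layer returns.) Its shape is (num_layers * num_directions, batch, hidden_size). Since you define your LSTM layer with num_layers=1 and bidirectional=False (the default values), you can simply do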

hidden = hidden[-1]

This gives you a tensor of shape (batch, hidden_size), i.e., the last hidden state for each sequence. Of course, you need to change your linear layers accordingly, e.g., self.lf = nn.Linear(hidden_size, hidden_size // 2) (integer division, since nn.Linear expects integer sizes), etc.
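Put together, a simpler model along these lines could look roughly like the sketch below. The class name, the second linear layer self.out, the ReLU and all sizes are placeholders I made up; only lstm_cell and lf follow the names used above.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence


class LastHiddenClassifier(nn.Module):
    """Hypothetical minimal model: LSTM -> last hidden state -> linear layers."""

    def __init__(self, input_size, hidden_size, num_classes):
        super().__init__()
        self.lstm_cell = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.lf = nn.Linear(hidden_size, hidden_size // 2)
        self.out = nn.Linear(hidden_size // 2, num_classes)

    def forward(self, padded, lengths):
        # enforce_sorted=False lets the batch arrive in any length order.
        packed = pack_padded_sequence(
            padded, lengths, batch_first=True, enforce_sorted=False
        )
        _, (hidden, _) = self.lstm_cell(packed)
        hidden = hidden[-1]  # (batch, hidden_size): last real step of each sequence
        return self.out(torch.relu(self.lf(hidden)))


torch.manual_seed(0)
model = LastHiddenClassifier(input_size=8, hidden_size=4, num_classes=2)
padded = torch.randn(3, 20, 8)          # 3 sequences padded to length 20
lengths = torch.tensor([20, 15, 7])     # their real lengths
logits = model(padded, lengths)
print(logits.shape)                     # torch.Size([3, 2])
```

Because the linear layers only see one vector per sequence, their input size no longer depends on the longest sequence in the batch.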

Lastly, be very careful with reshape() or view():

reshaped_out = outputt.reshape(outputt.size()[0],outputt.size()[1]*outputt.size()[2])

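If you are not careful, these can quickly mess up your data. Neither reshape() nor view() reorders elements; they only reinterpret the existing memory layout, so using them to swap dimensions silently mixes values from different sequences. Use permute() or transpose() when you need to move dimensions around. You might want to check this post.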
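Here is a toy illustration of that pitfall. It is a generic example, not a diagnosis of your line above: I assume a tensor laid out as (seq_len, batch, hidden_size), which is what nn.LSTM returns with batch_first=False, and the sizes are made up.

```python
import torch

# Hypothetical LSTM-style output with batch_first=False: (seq_len, batch, hidden_size).
seq_len, batch, hidden_size = 3, 2, 4
out = torch.arange(seq_len * batch * hidden_size).reshape(seq_len, batch, hidden_size)

# Wrong: reshape only reinterprets the memory layout, so row 0 ends up
# containing values that belong to the second sequence in the batch.
wrong = out.reshape(batch, seq_len * hidden_size)

# Right: move the batch dimension to the front first, then flatten per sequence.
right = out.permute(1, 0, 2).reshape(batch, seq_len * hidden_size)

print(wrong[0].tolist())  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
print(right[0].tolist())  # [0, 1, 2, 3, 8, 9, 10, 11, 16, 17, 18, 19]
```

Both results have the same shape, so nothing crashes; only the contents differ. The values 4 to 7 in the first printout come from the other sequence.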