Attention for sequence classification using an LSTM


I am using an LSTM with word2vec features to classify sentences. To improve performance, I’d like to try an attention mechanism. However, I can only find resources on how to implement attention for sequence-to-sequence models, not for sequence-to-fixed-output models.

Thus, I have a few questions:

  1. Is it even possible / helpful to use attention for simple classification?
  2. Is there a small working example of how to combine a simple LSTM with attention? I could not find anything helpful. All the code I found is very complicated, uncommented, and written for seq2seq.



Sure, you can use an attention mechanism for seq-2-one.

You can think of seq-2-one as a special case of seq-2-seq. The attention mechanism just reweights the input features fed to the decoder, based on those features and the decoder RNN's last output and last hidden state (these are not needed if the decoder is not an RNN). The mechanism itself doesn’t even know whether you are doing a seq-2-one or a seq-2-seq task. It’s all up to you.
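To make the reweighting concrete, here is a minimal pure-Python sketch of attention used as a pooling step. It assumes simple dot-product scoring, and the `query` vector is a stand-in: in seq-2-one there is no decoder state, so it could be a learned vector or the LSTM's last hidden state. The function name `attention_pool` and the toy numbers are only for illustration.

```python
import math

def attention_pool(hidden_states, query):
    """Collapse a sequence of hidden vectors into one context vector."""
    # Dot-product score between each time step's hidden vector and the query
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    # Softmax over time steps (shifted by the max for numerical stability)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: weighted sum of the hidden vectors
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return context, weights

# Toy sequence of three 2-dimensional hidden states (made-up values)
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]  # assumed query vector
context, weights = attention_pool(hidden_states, query)
```

The weights sum to one, and the time step that matches the query best gets the largest share. The resulting `context` has a fixed size regardless of sequence length, which is what a classifier needs.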

Did you see these examples? You can treat them as introductory tutorials.


Do you have any instructions on how to change any of these implementations for seq2one?

Here you go:

I hope that helps. As the others suggested, it’s just the Seq2Seq example simplified.
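For anyone looking for a starting point, below is a minimal sketch of a seq-2-one model along those lines, assuming PyTorch. It is not taken from any of the implementations mentioned in this thread. The class name `AttentionLSTMClassifier`, the single linear scoring layer, and the optional boolean `mask` argument are illustrative choices, and the sizes in the usage lines are arbitrary.

```python
import torch
import torch.nn as nn

class AttentionLSTMClassifier(nn.Module):
    """LSTM encoder followed by attention pooling and a linear classifier."""

    def __init__(self, embed_dim, hidden_dim, num_classes):
        super().__init__()
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Maps each time step's hidden state to one unnormalized score
        self.score = nn.Linear(hidden_dim, 1, bias=False)
        self.classify = nn.Linear(hidden_dim, num_classes)

    def forward(self, x, mask=None):
        # x: (batch, seq_len, embed_dim), e.g. precomputed word2vec vectors
        outputs, _ = self.lstm(x)                              # (batch, seq_len, hidden_dim)
        scores = self.score(torch.tanh(outputs)).squeeze(-1)   # (batch, seq_len)
        if mask is not None:
            # mask: bool tensor, True at real tokens, False at padding
            scores = scores.masked_fill(~mask, float("-inf"))
        weights = torch.softmax(scores, dim=1)                 # sums to 1 over time
        # Weighted sum of LSTM outputs gives one vector per sentence
        context = torch.bmm(weights.unsqueeze(1), outputs).squeeze(1)  # (batch, hidden_dim)
        return self.classify(context), weights

# Usage with random data standing in for a batch of 4 sentences of length 10
model = AttentionLSTMClassifier(embed_dim=300, hidden_dim=64, num_classes=2)
batch = torch.randn(4, 10, 300)
logits, weights = model(batch)
```

The only difference from a plain LSTM classifier is that the classifier sees a weighted sum of all time steps instead of just the last hidden state. Returning `weights` as well makes it easy to inspect which words the model attended to.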
