Attention for sequence classification using an LSTM

Hello,

I am using an LSTM with word2vec features to classify sentences. To improve performance, I’d like to try an attention mechanism. However, I can only find resources on how to implement attention for sequence-to-sequence models, not for sequence-to-fixed-output models.

Thus, I have a few questions:

  1. Is it even possible / helpful to use attention for simple classification?
  2. Is there a small working example of how to combine a simple LSTM with attention? I could not find anything helpful. All the code I found is very complicated, uncommented, and written for seq2seq.

Best,
Simon


Sure, you can use an attention mechanism for seq-2-one.

You can think of seq-2-one as a special case of seq-2-seq. The attention mechanism just adjusts the weights on the decoder’s input features, based on those features and on the RNN’s last output and last hidden state (the latter two are not needed if the decoder is not an RNN). The mechanism itself doesn’t even know whether you are doing a seq-2-one or a seq-2-seq task. It’s all up to you.

Have you seen these examples? You can treat them as introductory tutorials.


Do you have any instructions on how to change any of these implementations for seq2one?

Here you go: https://github.com/chrisvdweth/ml-toolkit/blob/master/pytorch/models/text/classifier/rnn.py

I hope that helps. As the others suggested, it’s just the Seq2Seq example simplified.
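For anyone who wants something smaller to start from, here is a minimal sketch of the seq-2-one idea in PyTorch. It is not the code from the linked repository. The class name `AttentionLSTMClassifier`, the single linear scoring layer, and all dimensions are assumptions made for illustration, and the input is assumed to be pre-computed word vectors (e.g. word2vec) of shape `(batch, seq_len, embed_dim)`. The LSTM outputs are scored per time step, the scores are turned into weights with a softmax, and the weighted sum of the outputs is fed to the classifier.

```python
import torch
import torch.nn as nn


class AttentionLSTMClassifier(nn.Module):
    """LSTM encoder + attention pooling + linear classifier (illustrative)."""

    def __init__(self, embed_dim, hidden_dim, num_classes):
        super().__init__()
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Assumption: one learned scalar score per time step.
        self.score = nn.Linear(hidden_dim, 1)
        self.classify = nn.Linear(hidden_dim, num_classes)

    def forward(self, x, mask=None):
        # x: (batch, seq_len, embed_dim)
        # mask: optional bool tensor (batch, seq_len), True for real tokens.
        # Every row of the mask needs at least one True, otherwise the
        # softmax below produces NaN for that row.
        outputs, _ = self.lstm(x)                 # (batch, seq_len, hidden_dim)
        scores = self.score(outputs).squeeze(-1)  # (batch, seq_len)
        if mask is not None:
            # Padded positions get zero attention weight after the softmax.
            scores = scores.masked_fill(~mask, float("-inf"))
        weights = torch.softmax(scores, dim=1)    # (batch, seq_len)
        # Weighted sum of the LSTM outputs: (batch, hidden_dim)
        context = torch.bmm(weights.unsqueeze(1), outputs).squeeze(1)
        return self.classify(context), weights


# Usage with random data standing in for word2vec features.
model = AttentionLSTMClassifier(embed_dim=300, hidden_dim=128, num_classes=2)
batch = torch.randn(4, 10, 300)
logits, weights = model(batch)  # logits: (4, 2), weights: (4, 10)
```

Returning the weights alongside the logits makes it easy to inspect which words the model attends to. Whether this actually beats simply using the last hidden state depends on the data, so it is worth comparing both.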
