Attention for LSTMs

I wanted to enable attention for my multi-to-one LSTM model, but I could not find any attention layers in the official docs. Is attention implemented? If not, are there any good, simple references on how to implement it?

I am not sure what you mean by a multi-to-one LSTM, so here is just an example.

In my practice, my model's input is a video, and I need to assign an attention weight to each frame. However, due to memory limits, it is not possible to process the whole video in a single batch.

So I use a fully connected (fc) layer with a single output value as the attention layer. Suppose the input video has 4x frames, but my model can only handle x frames at a time. I run the model for 4 batches, and the fc layer outputs 4x scalar values. After that, I apply a softmax over those 4x values to obtain the attention weights, which solves the problem.
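To make the idea concrete, here is a minimal PyTorch sketch of that scheme. All names and dimensions are illustrative assumptions, not from the original post: an LSTM encodes each frame, a one-output fc layer scores each frame, and a softmax over the scores of all chunks gives per-frame attention weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FrameAttention(nn.Module):
    """Sketch: per-frame attention computed chunk by chunk (hypothetical names)."""

    def __init__(self, feat_dim=128, hidden_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.score = nn.Linear(hidden_dim, 1)  # fc layer with one output: the attention score

    def forward(self, chunks):
        # chunks: list of tensors, each (1, x, feat_dim) -- the video split
        # into pieces small enough to fit in memory.
        outs, scores = [], []
        state = None
        for c in chunks:
            h, state = self.lstm(c, state)   # carry LSTM state across chunks
            outs.append(h)
            scores.append(self.score(h))     # (1, x, 1) score per frame
        h_all = torch.cat(outs, dim=1)       # (1, 4x, hidden_dim)
        s_all = torch.cat(scores, dim=1)     # (1, 4x, 1)
        w = F.softmax(s_all, dim=1)          # softmax over all 4x frames
        return (w * h_all).sum(dim=1)        # attention-weighted summary

model = FrameAttention()
video = torch.randn(1, 20, 128)              # 4x = 20 frames of feature vectors
chunks = list(video.split(5, dim=1))         # x = 5 frames per chunk
summary = model(chunks)
print(summary.shape)                         # torch.Size([1, 64])
```

The key point is that the softmax is applied only after all chunks have been scored, so the weights are normalized over the full 4x frames rather than within each chunk.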