I am trying to add self-attention on top of an LSTM and have trouble getting it working with packed sequences. I know that for the output (the first return value of the LSTM) we can unpack it and then unsort it back to the original batch order. However, how does this work for the hidden state (the second return value)? Thanks for any help and pointers!
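To make the question concrete, here is a minimal sketch of the setup I mean (dimensions and variable names are made up for illustration): the padded output is unsorted along the batch dimension (dim 0 with `batch_first=True`), but `h_n`/`c_n` have shape `(num_layers * num_directions, batch, hidden)`, so my understanding is they would need to be unsorted along dim 1 instead:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# hypothetical sizes for illustration
batch, max_len, in_dim, hid = 3, 5, 8, 16
lstm = nn.LSTM(in_dim, hid, batch_first=True)

x = torch.randn(batch, max_len, in_dim)
lengths = torch.tensor([5, 2, 4])

# sort sequences by length (required when enforce_sorted=True, the default)
sorted_lens, sort_idx = lengths.sort(descending=True)
unsort_idx = sort_idx.argsort()

packed = pack_padded_sequence(x[sort_idx], sorted_lens, batch_first=True)
packed_out, (h_n, c_n) = lstm(packed)

# unpack the per-step outputs and unsort along the batch dim (dim 0)
out, _ = pad_packed_sequence(packed_out, batch_first=True)
out = out[unsort_idx]

# h_n / c_n: batch is dim 1, so unsort along dim 1 instead
h_n = h_n[:, unsort_idx]
c_n = c_n[:, unsort_idx]
```

With a single-layer unidirectional LSTM, a sanity check is that `h_n[0, b]` should match `out[b, lengths[b] - 1]` for every sequence `b` after unsorting; is indexing `h_n` on dim 1 like this the right way to get that, or is there a cleaner approach?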