MultiHeadAttention not working in Sentiment Classifier

Michael_Ringer · February 22, 2021, 7:43pm

Hey,

I'm currently trying to implement an IMDb sentiment classifier with a transformer (encoder block). When I remove the encoder blocks from the network, I get better results than with them.

Also, when I plot the attention matrix, which as far as I know should show a vertical line, it just shows some straight lines.

Colab link: