Two things to consider:
- (Deep) neural networks require a large amount of data to yield good predictions. According to the table, you have 2.8k sentences, which is not much. I assume the scores you show are for the test data. How do they look on the training data?
- Without specifically having tested that code, I’m pretty sure that the line `x = x.view(x.shape[1], x.shape[0], -1)` causes a problem; please read this older post of mine. Try `x = x.transpose(0, 1)`.
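
To illustrate the difference, here is a minimal sketch with a made-up tensor. The shapes and values are my assumption for illustration, not taken from your code. Both calls return the same shape, but only `transpose` keeps each sequence intact:

```python
import torch

# Hypothetical batch-first input: 2 sequences, 3 time steps, 1 feature each.
# Sequence 0 holds the values 0, 1, 2 and sequence 1 holds 3, 4, 5.
x = torch.arange(6).reshape(2, 3, 1)  # shape (batch=2, seq_len=3, features=1)

# view() only reinterprets the flat memory layout; it does not move elements.
viewed = x.view(x.shape[1], x.shape[0], -1)  # shape (3, 2, 1)

# transpose() swaps the two axes, so entry [t, b] is x[b, t].
transposed = x.transpose(0, 1)  # shape (3, 2, 1)

print(viewed.squeeze(-1).tolist())      # [[0, 1], [2, 3], [4, 5]]
print(transposed.squeeze(-1).tolist())  # [[0, 3], [1, 4], [2, 5]]
```

In a sequence-first layout, the first row should hold the first time step of every sequence, i.e. `[0, 3]`. The `view` version instead puts two consecutive time steps of the same sequence there, so the sequences get mixed up without raising any error.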