How to use nn.TransformerEncoder

Hello.
I replaced an RNN encoder with a Transformer encoder. It seems to train well, since the CE loss decreases continuously, but accuracy is the problem. I think the model can’t find the end of the sequence, because the prediction keeps emitting ‘E’ or ‘I’ after the end of the sentence.
For example,
Prediction:
AND TN TIOK TOOA D NH AHEULD TOT TORLYW AHE E WN TNDTVT TF TOU TOC LNEON ONSHETL TOT TNTEOE TAOAENNYAN N TYST AASL D AH TITTE TH TEALEED TERWAICE TN WN TN THE SASH D TORTENN E EEEEEEEEEEEEEEEEEEEEEEEEEIEEEEIEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
Answer:
AND IF LORD FREDERICK SHOULD NOT FOLLOW THERE IS AN END OF YOUR SUSPICIONS I SHALL NOT EASILY PREVAIL UPON MISS MILNER TO LEAVE TOWN REPLIED HE WHILE IT IS IN THE HIGHEST FASHION

I simply use an nn.TransformerEncoder module, and replaced

# pack the padded batch so the RNN skips the padded time steps
x = nn.utils.rnn.pack_padded_sequence(x, output_lengths)
x, h_state = self.rnn(x)
# unpack back into a padded (seq_len, batch, feature) tensor
x, _ = nn.utils.rnn.pad_packed_sequence(x)

with

x = self.encoder(src=x)

The input tensor x has the shape (sequence_length, batch_size, feature_dim).
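
Since I dropped pack_padded_sequence, I guess the encoder no longer knows which time steps are padding. Maybe I should pass a key padding mask; here is a minimal sketch of what I have in mind (assuming output_lengths is a tensor of the true lengths, as in the RNN version):

import torch

# Hypothetical sketch: mark the padded positions so self-attention ignores them.
# x: (seq_len, batch_size, feature_dim), output_lengths: (batch_size,)
max_len = x.size(0)
# True = position is padding and should be ignored by attention
padding_mask = torch.arange(max_len, device=x.device)[None, :] >= output_lengths[:, None]
x = self.encoder(src=x, src_key_padding_mask=padding_mask)  # mask shape: (batch_size, seq_len)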
Is there anything else I need to do to use nn.TransformerEncoder correctly?
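
Also, I wonder whether I need to add positional encoding before the encoder, since unlike the RNN the Transformer has no sense of token order by itself. A sketch of the standard sinusoidal encoding from “Attention Is All You Need” (the class name and wiring are my own guess, not from my actual code):

import math
import torch
import torch.nn as nn

class PositionalEncoding(nn.Module):
    # Standard sinusoidal positional encoding for (seq_len, batch, d_model) inputs.
    # Assumes d_model is even.
    def __init__(self, d_model, max_len=5000):
        super().__init__()
        pe = torch.zeros(max_len, d_model)
        position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions
        pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions
        self.register_buffer('pe', pe.unsqueeze(1))   # (max_len, 1, d_model)

    def forward(self, x):
        # x: (seq_len, batch_size, d_model)
        return x + self.pe[:x.size(0)]

# In the model, before the encoder call:
# x = self.pos_encoder(x)
# x = self.encoder(src=x, src_key_padding_mask=padding_mask)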