Here is the code that attempts to learn an encoding for paragraphs (a paragraph being a set of sentences that are themselves already encoded):
```python
import torch
import torch.nn as nn

class TransformerEnc(nn.Module):
    def __init__(self):
        super(TransformerEnc, self).__init__()
        d_model = 1024
        self.posenc = PositionalEncoding(d_model)  # defined elsewhere
        encoder_layer = nn.TransformerEncoderLayer(d_model, nhead=8)
        self.model = nn.TransformerEncoder(encoder_layer, num_layers=6)

    def forward(self, x):
        x = self.posenc(x)
        output = self.model(x)
        output = torch.sum(output, 1)  # pool over dim 1
        return output
```
The `forward` function receives a tensor `x` of shape `[64, 20, 1024]`, where 64 is the batch size, 20 is the maximum number of sentences in a paragraph, and 1024 is the dimension of each encoded sentence. I need an output of shape `[64, 1024]`, i.e. the 20 sentence embeddings collapsed into a single 1024-dimensional vector per paragraph.
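To make the intended pooling concrete, here is a minimal shape check with a dummy batch (random data, not my real inputs), showing how summing over dim 1 collapses the sentence axis:

```python
import torch

# Dummy batch: 64 paragraphs, up to 20 sentences each, 1024-dim sentence vectors
x = torch.randn(64, 20, 1024)

# Collapse the sentence axis (dim 1) into one vector per paragraph
pooled = torch.sum(x, dim=1)
print(pooled.shape)  # torch.Size([64, 1024])
```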
When I train a model on top of this embedding, I get random-chance accuracy. If I replace this module with an LSTM, it trains well, so clearly there is some issue with the Transformer version.
Any ideas what the problem might be?
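For context while debugging shapes: unless you pass `batch_first=True` (available in newer PyTorch versions), `nn.TransformerEncoderLayer` / `nn.TransformerEncoder` interpret the input as `[seq_len, batch, d_model]`, not `[batch, seq_len, d_model]`. A small sketch of the default layout (toy dimensions, not my real model):

```python
import torch
import torch.nn as nn

# batch_first defaults to False, so the expected layout is [seq_len, batch, d_model]
layer = nn.TransformerEncoderLayer(d_model=32, nhead=4)
enc = nn.TransformerEncoder(layer, num_layers=2)

x = torch.randn(20, 64, 32)  # 20 = sequence length, 64 = batch, 32 = d_model
out = enc(x)
print(out.shape)  # torch.Size([20, 64, 32]) — shape is preserved
```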