Hi, I am wondering how I could perform mean pooling within a given range of word embeddings.
For example, I have a list of 100 word embeddings. I would like to mean-pool the embeddings for words 0–35 and 35–100 separately. How could I do this given the list and an index list of segment boundaries, [0, 35, 100]?
Any help would be appreciated…
Thanks!
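The question above can be sketched as follows — a minimal example, assuming the embeddings are stacked into a single (seq_len, hidden) tensor and the index list holds consecutive segment boundaries (mean_pool_segments is a hypothetical helper name, not an existing API):

```python
import torch

def mean_pool_segments(embeddings, boundaries):
    # embeddings: (seq_len, hidden) tensor
    # boundaries: e.g. [0, 35, 100]; each [start, end) pair is one segment
    segments = [embeddings[start:end].mean(dim=0)
                for start, end in zip(boundaries[:-1], boundaries[1:])]
    # stack into a (num_segments, hidden) tensor
    return torch.stack(segments)

embeddings = torch.randn(100, 8)
pooled = mean_pool_segments(embeddings, [0, 35, 100])
print(pooled.shape)  # torch.Size([2, 8])
```

Here pooled[0] is the mean of rows 0–34 and pooled[1] the mean of rows 35–99, matching the boundary list.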
What I have currently done is:
def create_sentence_embedding(self, encoder_states, sentence_indices, sentence_index_num):
    # keep only the valid boundaries for this sample
    sentence_indices = sentence_indices[:sentence_index_num]
    # mean-pool each [start, end) slice of encoder states
    xxs = [torch.mean(encoder_states[start:end], dim=0)
           for start, end in zip(sentence_indices[:-1], sentence_indices[1:])]
    return torch.stack(xxs)
def forward(self, encoder_states, yys, sentence_index_num):
    # pool encoder states into one embedding per sentence, per sample
    xxs = [self.create_sentence_embedding(encoder_state, yys[i], sentence_index_num[i][0])
           for i, encoder_state in enumerate(encoder_states[0])]
    xxs = rnn_utils.pad_sequence(xxs, batch_first=True)
    # pass xxs into a sentence transformer
    xxs, mask, seq_len = self.sentence_encoder(xxs)
    # pass xxs into a feed-forward position-wise network
    xxs = self.sentence_ordering_decoder(xxs)
    return xxs
However, after doing this, my network weights do not update… so I think there is a problem with the approach above.
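One thing worth ruling out is whether the pooling itself detaches the autograd graph. Slicing, mean, and stack are all differentiable, so gradients should flow back through the pooled embeddings — a quick standalone check (a sketch, not tied to the model above):

```python
import torch

# A tensor standing in for encoder states, with gradients enabled
embeddings = torch.randn(100, 8, requires_grad=True)
boundaries = [0, 35, 100]

# Same pooling pattern as in the question: mean over each boundary slice
pooled = torch.stack([embeddings[s:e].mean(dim=0)
                      for s, e in zip(boundaries[:-1], boundaries[1:])])

# If pooling kept the graph intact, backward() populates embeddings.grad
pooled.sum().backward()
print(embeddings.grad is not None)  # True
```

If embeddings.grad is populated here, the pooling is not the culprit, and the issue is more likely elsewhere (e.g. which parameters were passed to the optimizer, or a detach/no_grad somewhere upstream).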