Seq2Seq - same generation for every entry in the batch

Hi. I have an LSTM-based sequence-to-sequence model for paraphrase generation that seems to train (the loss goes down), but the generations are poor. Within each epoch, the generations are all identical: every entry in the batch is decoded to the same sentence. The generated sentence does seem to improve across epochs, but it eventually plateaus at some arbitrary sentence, and that one sentence remains the prediction for every entry in the batch.
My generation function is as follows:

def greedy_search(self, decoder_input, state=None):
    batch_size = len(decoder_input)

    # one row of generated token ids per batch entry
    seqs = torch.zeros((batch_size, self.max_sequence_length))
    for t in range(self.max_sequence_length):
        # take the top-1 token for every entry and feed it back as the next input
        words, _, state = self.decode_step(decoder_input, state, k=1)
        seqs[:, t] = words[:, 0]
        decoder_input = words.tolist()

    return seqs.int()
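
For context, this is roughly how I call it; `model`, `SOS_IDX`, and the batch size below are placeholders rather than my exact code, the point is just the input format (a list with one token list per batch entry):

# Rough sketch of how greedy_search gets called.
# `model`, SOS_IDX and batch_size are placeholders, not my exact code.
import torch

SOS_IDX = 1          # assumed start-of-sequence token id
batch_size = 4

# one start token per batch entry, as a list of token lists
decoder_input = [[SOS_IDX] for _ in range(batch_size)]

model.eval()
with torch.no_grad():
    seqs = model.greedy_search(decoder_input)  # (batch_size, max_sequence_length)
print(seqs)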

My decode step is as follows:

def _decode_step(self, input_list, state_list, k=1):
    device = next(self.decoder.parameters()).device

    # stack the per-entry token lists into a single (batch_size, len) tensor
    inputs = torch.stack([torch.tensor(inp).to(device) for inp in input_list])

    # rebuild the decoder state and run one decoding step
    state = State().from_list(state_list)
    logits, new_state = self.forward_decoder(inputs, state=state)

    # keep only the logits for the last time step, then take the top-k tokens
    logits = logits.select(1, -1).contiguous()
    logprobs = F.log_softmax(logits, dim=-1)
    logprobs, words = logprobs.topk(k, dim=-1)

    # split the new state back into one entry per batch element
    new_state_list = [new_state[i] for i in range(len(input_list))]
    return words, logprobs, new_state_list
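
To confirm the symptom, after generation I compare every row of `seqs` against the first one; it always comes out identical (a minimal sketch, assuming `seqs` is the output of greedy_search above):

# sanity check: are all batch entries decoded to the same sequence?
identical = bool((seqs == seqs[0]).all())
print("all batch entries identical:", identical)  # prints True on every batch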

Do you have any suggestions on what may be causing this issue? Thanks in advance for the help.