Seq2seq for NMT: why does the decoder keep predicting repeated tokens?

I have built an encoder-decoder architecture for machine translation. During inference, I found that the decoder keeps generating repeated tokens, something like:

tensor([[ 2,  6, 13,  5,  4,  6, 13,  5,  4,  6, 13,  5,  4,  9, 11,  4,  6, 13,
          5,  4,  6, 13,  5,  4,  6, 13,  5,  4,  6, 13,  5,  4,  6, 13,  5,  4,
          6, 13,  5,  4,  6, 13,  5, 12,  5,  4,  6, 13,  5,  4,  9, 11,  4,  6,
         13,  5,  4,  6, 13,  5,  4,  6, 13,  5,  4,  6, 13,  5,  4,  6, 13,  5,
          4,  6, 13,  5, 12,  5,  4,  0],
        [ 2,  6, 13,  5,  4,  6, 13,  5,  4,  6, 13,  5,  4,  9, 11,  4,  6, 13,
          5,  4,  6, 13,  5,  4,  6, 13,  5,  4,  6, 13,  5,  4,  6, 13,  5,  4,
          6, 13,  5,  4,  6, 13,  5, 12,  5,  4,  6, 13,  5,  4,  9, 11,  4,  6,
         13,  5,  4,  6, 13,  5,  4,  6, 13,  5,  4,  6, 13,  5,  4,  6, 13,  5,
          4,  6, 13,  5,  4,  6, 13,  0],
        [ 2,  6, 13,  5,  4,  6, 13,  5,  4,  6, 13,  5,  4,  9, 11,  4,  6, 13,
          5,  4,  6, 13,  5,  4,  6, 13,  5,  4,  6, 13,  5,  4,  6, 13,  5,  4,
          6, 13,  5,  4,  6, 13,  5, 12,  5,  4,  6, 13,  5,  4,  9, 11,  4,  6,
         13,  5,  4,  6, 13,  5,  4,  6, 13,  5,  4,  6, 13,  5,  4,  6, 13,  5,
          4,  6, 13,  5,  4,  9, 10,  0]], device='cuda:0')

I set the beam size to 3 in this case. The model takes a sequence of characters as input, and the decoder predicts one character at each time step. I don't know why this is happening. Any help would be great!
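
To make the setup concrete, here is a minimal, self-contained sketch of the kind of step-by-step decoding I am doing. Note this uses greedy decoding rather than beam search, and the module definitions, vocabulary size, and special-token indices are placeholders rather than my actual code:

import torch
import torch.nn as nn

# Placeholder special character indices and sizes; my real vocab differs.
PAD, SOS, EOS = 0, 2, 1
VOCAB, EMB, HID = 16, 32, 64

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
    def forward(self, src):
        out, hidden = self.rnn(self.emb(src))
        return out, hidden

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)
    def forward(self, tok, hidden):
        out, hidden = self.rnn(self.emb(tok), hidden)
        return self.out(out), hidden

def greedy_decode(encoder, decoder, src, max_len=80):
    # Decode one character per step; beam search follows the same step structure,
    # just keeping the top-k partial hypotheses instead of the single argmax.
    with torch.no_grad():
        _, hidden = encoder(src)
        tok = torch.full((src.size(0), 1), SOS, dtype=torch.long)
        chars = [tok]
        for _ in range(max_len):
            logits, hidden = decoder(tok, hidden)
            tok = logits.argmax(dim=-1)      # most likely next character
            chars.append(tok)
            if (tok == EOS).all():
                break
        return torch.cat(chars, dim=1)

src = torch.randint(3, VOCAB, (3, 20))       # fake batch of character ids
print(greedy_decode(Encoder(), Decoder(), src))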