How to do padding on the decoder side?

Hi there,

I am working on an image-to-sequence model, where the encoder input is an image and the decoder output is a sequence. I am familiar with pack_padded_sequence for seq2seq models, but in my case there is no sequence on the encoder side, and I am running into problems applying it to the sequences on the decoder side. The problem is:
pack_padded_sequence requires a tensor of seq_lengths sorted in descending order, but with the decoder we only pass one token at a time, i.e. the decoder input tensor has shape [batch]. That means all the seq_lens will be the same, namely 1. How can I use pack_padded_sequence under this condition?
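To make the issue concrete, here is a minimal sketch (sizes are made up) of what pack_padded_sequence does with a normal padded batch versus a single decoding step, where every "sequence" has length 1 and packing removes nothing:

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

# Hypothetical batch of 3 sequences padded to length 5 (feature dim 1).
# Lengths must be sorted in descending order (or pass enforce_sorted=False).
padded = torch.zeros(5, 3, 1)           # [max_len, batch, feat]
lengths = torch.tensor([5, 3, 2])       # one true length per sequence

packed = pack_padded_sequence(padded, lengths)
# packed.data keeps only the non-padded steps: 5 + 3 + 2 = 10 of them
print(packed.data.shape)                # torch.Size([10, 1])

# With a decoder fed one token at a time, every "sequence" has length 1,
# so packing keeps everything and changes nothing:
step = torch.zeros(1, 3, 1)             # [1, batch, feat]
packed_step = pack_padded_sequence(step, torch.tensor([1, 1, 1]))
print(packed_step.data.shape)           # torch.Size([3, 1])
```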

for e.g.:

trg = <tensor of shape [len, batch]>

dec_src = trg[0, :]   # [batch]
for t in range(1, trg_len):
    output, hidden, cell = self.lstm_decoder(dec_src, encoder_out, hidden, cell)     # O: [B, out]   H: [1, B, Hid]
    top1 = output.argmax(1)     # [batch_size]
    dec_src = top1              # feed the predicted token back in at the next step
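For reference, here is a self-contained, runnable version of the loop above. DemoDecoder is a hypothetical stand-in for self.lstm_decoder (names and sizes are assumptions, and the encoder_out/attention input is omitted for brevity):

```python
import torch
import torch.nn as nn

# Hypothetical minimal decoder standing in for self.lstm_decoder.
class DemoDecoder(nn.Module):
    def __init__(self, vocab_size, emb_dim, hid_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hid_dim)
        self.fc = nn.Linear(hid_dim, vocab_size)

    def forward(self, token, hidden, cell):
        emb = self.embedding(token.unsqueeze(0))         # [1, B, emb]
        out, (hidden, cell) = self.lstm(emb, (hidden, cell))
        return self.fc(out.squeeze(0)), hidden, cell     # O: [B, vocab]

vocab, batch, hid, trg_len = 10, 4, 8, 6
decoder = DemoDecoder(vocab, emb_dim=8, hid_dim=hid)
trg = torch.randint(0, vocab, (trg_len, batch))          # [len, batch]
hidden = torch.zeros(1, batch, hid)                      # H: [1, B, Hid]
cell = torch.zeros(1, batch, hid)

dec_src = trg[0, :]                                      # [batch]
outputs = []
for t in range(1, trg_len):
    output, hidden, cell = decoder(dec_src, hidden, cell)  # O: [B, vocab]
    outputs.append(output)
    dec_src = output.argmax(1)                           # greedy next token
outputs = torch.stack(outputs)                           # [len-1, B, vocab]
print(outputs.shape)                                     # torch.Size([5, 4, 10])
```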

Any suggestions would be appreciated. Thank you!