Variable length inputs for RNN. And yes, I have checked the similar questions asked on the forum first.

I have worked through pad_packed_sequence and pack_padded_sequence to understand them, but what I found does not include code for a full training example. I am assuming the task at hand is seq2seq where the output has the same length as the input, e.g. a language model.

Some training examples I found online follow the pattern below, which I find sub-optimal:

  1. pad so that all sequences have the same length
  2. use pack_padded_sequence to pack
  3. pass it through the RNN, possibly with an MLP on top of the RNN output
  4. unpack via pad_packed_sequence
  5. multiply by a mask that is zero at the padded positions of the shorter sequences, to stop their gradient
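
To make the pattern concrete, here is a minimal sketch of how I understand those five steps. The toy batch, the layer sizes, and the names embed, rnn and proj are all made up for illustration, and I am assuming a GRU with a linear layer on top and cross-entropy as the loss.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)

# Hypothetical toy batch: 3 sequences of token ids, already padded with 0 (step 1).
lengths = torch.tensor([5, 3, 2])  # true lengths, sorted in descending order
inputs = torch.tensor([[1, 2, 3, 4, 5],
                       [6, 7, 8, 0, 0],
                       [9, 1, 0, 0, 0]])
targets = torch.tensor([[2, 3, 4, 5, 6],
                        [7, 8, 9, 0, 0],
                        [1, 2, 0, 0, 0]])

vocab_size, emb_dim, hidden = 10, 8, 16  # illustrative sizes
embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
rnn = nn.GRU(emb_dim, hidden, batch_first=True)
proj = nn.Linear(hidden, vocab_size)

# Steps 2-3: pack the padded batch and run the RNN on it.
packed = pack_padded_sequence(embed(inputs), lengths, batch_first=True)
packed_out, _ = rnn(packed)

# Step 4: unpack back to a padded (batch, time, hidden) tensor.
out, _ = pad_packed_sequence(packed_out, batch_first=True,
                             total_length=inputs.size(1))
logits = proj(out)  # note: the linear layer also runs on the padded positions

# Step 5: per-token loss, multiplied by a mask that is 0 at padded positions.
mask = (torch.arange(inputs.size(1))[None, :] < lengths[:, None]).float()
per_token = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1), reduction="none"
).view_as(mask)
masked_loss = (per_token * mask).sum() / mask.sum()
masked_loss.backward()
```

The part that looks wasteful to me is that proj and the loss are evaluated on every padded position and then thrown away by the mask.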

I think it would be better if we do:
  1-3. same as before
  4. pack the target variable
  5. loss = loss_fun(packed outputs from RNN, packed target)
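
Here is a rough sketch of what I mean, under the same made-up setup (toy batch, GRU plus linear layer, cross-entropy; embed, rnn and proj are hypothetical names). One detail I am assuming: the loss function does not take a PackedSequence directly, so I pass the .data tensors, which hold only the real time steps. Both tensors are packed with the same lengths, so their rows should line up.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence

torch.manual_seed(0)

# Hypothetical toy batch: 3 sequences of token ids, padded with 0.
lengths = torch.tensor([5, 3, 2])  # true lengths, sorted in descending order
inputs = torch.tensor([[1, 2, 3, 4, 5],
                       [6, 7, 8, 0, 0],
                       [9, 1, 0, 0, 0]])
targets = torch.tensor([[2, 3, 4, 5, 6],
                        [7, 8, 9, 0, 0],
                        [1, 2, 0, 0, 0]])

vocab_size, emb_dim, hidden = 10, 8, 16  # illustrative sizes
embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
rnn = nn.GRU(emb_dim, hidden, batch_first=True)
proj = nn.Linear(hidden, vocab_size)

# Steps 1-3: pad (already done), pack, run the RNN.
packed_in = pack_padded_sequence(embed(inputs), lengths, batch_first=True)
packed_out, _ = rnn(packed_in)

# Step 4: pack the targets with the same lengths, so the ordering matches.
packed_tgt = pack_padded_sequence(targets, lengths, batch_first=True)

# Step 5: compute the loss on the packed data only.
# packed_out.data has shape (sum(lengths), hidden): no padded rows at all.
logits = proj(packed_out.data)
loss = nn.functional.cross_entropy(logits, packed_tgt.data)
loss.backward()
```

Here proj and the loss only see sum(lengths) rows instead of batch * max_len rows, and no mask is needed. With the default mean reduction I would expect this to match the masked mean over the real tokens.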

This way I think you save computation, since the layers after the RNN and the loss never touch padded positions. Let me know if you think this is correct, or point me to code for a full training cycle. I also think variable-length input is a common enough topic to include in the official tutorials.



Can I re-post an unanswered question?