Do we need to set a fixed input sentence length when we use padding-packing with RNN?

The RNN sees each word, i.e., a vector of size 5, step by step. If there are 6 words, the RNN sees 6 vectors and then stops; same with 8 words. Your confusion might stem from the fact that LSTM or GRU hides this step-wise processing: you give the model a sequence of a certain length, but internally the model loops over the sequence. More words just means more loops before it’s finished.
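Here is a minimal sketch (assuming PyTorch, an input size of 5, and an arbitrary hidden size of 16) showing that the same LSTM handles a 6-word and an 8-word sequence; only the number of internal steps changes:

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=5, hidden_size=16, batch_first=True)

seq_6 = torch.randn(1, 6, 5)   # batch of 1 sequence with 6 "words"
seq_8 = torch.randn(1, 8, 5)   # batch of 1 sequence with 8 "words"

out_6, _ = lstm(seq_6)         # output shape: (1, 6, 16)
out_8, _ = lstm(seq_8)         # output shape: (1, 8, 16)
print(out_6.shape, out_8.shape)
```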

Obviously, things get problematic with batches if the sequences in a batch have different lengths. The default solution is to pad all shorter sequences to the length of the longest sequence.
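As a minimal sketch (again assuming PyTorch and made-up sequence lengths), this is what padding plus packing looks like, so the LSTM skips the padded steps:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence

lstm = nn.LSTM(input_size=5, hidden_size=16, batch_first=True)

# Three sequences with different numbers of "words"
seqs = [torch.randn(6, 5), torch.randn(8, 5), torch.randn(3, 5)]
lengths = torch.tensor([len(s) for s in seqs])

padded = pad_sequence(seqs, batch_first=True)                  # (3, 8, 5), zero-padded
packed = pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=False)

packed_out, _ = lstm(packed)
out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)  # back to (3, 8, 16)
print(out.shape, out_lengths)
```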

The size/complexity of the LSTM/GRU (the number of neurons if you will, but it’s better to think in terms of the number of trainable parameters) depends on:

  • the size of the input (e.g., 5 in your example)
  • the size of the hidden dimension
  • number of layers in case of a stacked LSTM/GRU
  • whether it is uni- or bidirectional.

It does not depend on the sequence length. Sure, the processing takes more time for longer sequences.
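You can verify this with a quick sketch (assuming PyTorch and arbitrary example sizes): the parameter count is fixed by the factors above and stays the same no matter how long the sequences you later feed in are:

```python
import torch.nn as nn

def count_params(model):
    # Total number of trainable parameters
    return sum(p.numel() for p in model.parameters())

lstm = nn.LSTM(input_size=5, hidden_size=16, num_layers=2, bidirectional=True)
print(count_params(lstm))  # same number whether you feed 6-word or 800-word sequences
```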
