Why do we need to pad the input for variable-length sequences for an LSTM when there is a pack_padded_sequence function that essentially tells the LSTM to ignore the padded portion? Why isn't pack_padded_sequence, together with the sequence lengths, sufficient for training via mini-batches? Why do we need to pre-pad the input?
Why isn't pack_padded_sequence sufficient on its own? Why do we need to do the padding ourselves when we already supply the sequence lengths?
Hopefully we’ll get an answer some day.
It’s because every row in a tensor has to have the same dimension: a tensor is rectangular, so the shorter sequences must be padded to the length of the longest one before they can be stacked into a batch. There’s no other way, as PyTorch does not accept a Python list of variable-length sequences for this method; pack_padded_sequence expects a single padded tensor plus the true lengths.
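A minimal sketch of the usual workflow, assuming a made-up batch of three sequences with feature dimension 5 and an arbitrary LSTM configuration: pad_sequence builds the rectangular tensor, and pack_padded_sequence then consumes it together with the true lengths.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence

# Hypothetical variable-length sequences (lengths 4, 2, 3), feature dim 5.
seqs = [torch.randn(4, 5), torch.randn(2, 5), torch.randn(3, 5)]
lengths = torch.tensor([len(s) for s in seqs])

# pad_sequence stacks the ragged list into one rectangular tensor,
# zero-padding the shorter sequences: shape (batch, max_len, 5).
padded = pad_sequence(seqs, batch_first=True)

# pack_padded_sequence takes the padded tensor plus the true lengths;
# it cannot take the Python list of ragged tensors directly.
packed = pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=False)

lstm = nn.LSTM(input_size=5, hidden_size=8, batch_first=True)
packed_out, (h_n, c_n) = lstm(packed)  # padded timesteps are skipped

# Unpack back to a padded tensor if per-timestep outputs are needed.
out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)
print(out.shape)     # torch.Size([3, 4, 8])
print(out_lengths)   # tensor([4, 2, 3])
```

So the padding is just what makes the batch a valid tensor in the first place; the lengths passed to pack_padded_sequence are what let the LSTM ignore the padded positions.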