Average of the GRU/LSTM outputs for variable length sequences

Alternatively, you might want to consider building batches in which all input sequences have the same length and all target sequences have the same length. This completely avoids padding (and the optional packing).
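A minimal sketch of what I mean, in plain Python. The helper name `equal_length_batches` and the toy `pairs` data are made up for illustration; it assumes each sample is an (input, target) pair of token-id lists.

```python
import random
from collections import defaultdict

def equal_length_batches(pairs, batch_size, shuffle=True, seed=None):
    """Group (input, target) pairs so that every batch shares a single
    (input length, target length) combination. Hypothetical helper."""
    rng = random.Random(seed)

    # Bucket the samples by their (input length, target length) key.
    buckets = defaultdict(list)
    for src, tgt in pairs:
        buckets[(len(src), len(tgt))].append((src, tgt))

    # Cut each bucket into chunks of at most batch_size samples.
    batches = []
    for bucket in buckets.values():
        if shuffle:
            rng.shuffle(bucket)
        for i in range(0, len(bucket), batch_size):
            batches.append(bucket[i:i + batch_size])

    # Shuffle the batch order so training does not see lengths in sequence.
    if shuffle:
        rng.shuffle(batches)
    return batches

# Toy data: three pairs with lengths (3, 2) and two pairs with lengths (2, 1).
pairs = [
    ([1, 2, 3], [4, 5]),
    ([6, 7, 8], [9, 1]),
    ([2, 3], [4]),
    ([5, 6, 7], [8, 9]),
    ([1, 1], [2]),
]
batches = equal_length_batches(pairs, batch_size=2, seed=0)
```

Since every batch is rectangular, you can turn it into a tensor directly, and the average of the GRU/LSTM outputs over time is then just a plain mean over the time dimension with no masking. The trade-off is that some batches end up smaller than `batch_size`, for example when a bucket holds an odd number of samples.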
