Ignore padding for LSTM batch training

I realize there is `pack_padded_sequence` and so on for batch training LSTMs, but that takes an entire sequence, embeds it, and then forwards it through the LSTM. My LSTM is built so that it takes one input character at a time and the forward pass outputs a categorical distribution at each step. So I pad the sequences beforehand so they're equal length, and then each index is fed in sequentially. Unfortunately this also means the first characters are pad characters (because I use pre-padding). Is there a way to get the LSTM not to backprop on inputs that are just pads with this setup?
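
For concreteness, here's roughly what my training step looks like. This is just a minimal sketch with made-up sizes and a hypothetical `PAD_IDX`, using `nn.LSTMCell` to stand in for my model:

```python
import torch
import torch.nn as nn

# Illustrative sketch only; vocab size, dimensions, and PAD_IDX are made up.
PAD_IDX = 0
vocab_size, embed_dim, hidden_dim = 50, 32, 64

embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=PAD_IDX)
lstm_cell = nn.LSTMCell(embed_dim, hidden_dim)
to_logits = nn.Linear(hidden_dim, vocab_size)
criterion = nn.CrossEntropyLoss()

def forward_batch(inputs, targets):
    # inputs/targets: (batch, seq_len), pre-padded with PAD_IDX on the left
    batch_size, seq_len = inputs.shape
    h = torch.zeros(batch_size, hidden_dim)
    c = torch.zeros(batch_size, hidden_dim)
    loss = 0.0
    for t in range(seq_len):
        x_t = embedding(inputs[:, t])    # one character per example at step t
        h, c = lstm_cell(x_t, (h, c))    # for shorter sequences, the early x_t are all pads
        logits = to_logits(h)            # categorical distribution at this step
        loss = loss + criterion(logits, targets[:, t])
    return loss
```

The loop itself is what I want to keep; the problem is that the early timesteps of the shorter sequences contribute pad-only inputs to the loss and the gradients.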