I’m training an LSTM on sequences of variable lengths, padded so they all have the same length. The LSTM trains successfully, but sampling from it yields a lot of the padding character. I’ve seen plenty of implementations that handle padding with pack_padded_sequence / pad_packed_sequence, but my sequences are already padded. Is there an example of how to make the LSTM backpropagate only over the actual (unpadded) part of each sequence? I haven’t found anything about this in the docs for torch.nn.LSTM.
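For context, here is a minimal sketch of the kind of setup I mean (all names, shapes, and the PAD index are illustrative, not from a real codebase). The "naive" loss averages over every timestep, padding included, which I suspect is why sampling produces so many pad characters; the second loss masks padded targets via `ignore_index`, which is the sort of thing I'm asking about:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

PAD_IDX = 0           # index reserved for the padding character
VOCAB = 5             # toy vocabulary size
BATCH, MAX_LEN = 3, 7

model = nn.LSTM(input_size=VOCAB, hidden_size=8, batch_first=True)
proj = nn.Linear(8, VOCAB)

# Padded integer sequences; the real lengths vary per example.
seqs = torch.randint(1, VOCAB, (BATCH, MAX_LEN))
lengths = torch.tensor([7, 4, 2])
for i, L in enumerate(lengths):
    seqs[i, L:] = PAD_IDX

x = nn.functional.one_hot(seqs, VOCAB).float()   # (BATCH, MAX_LEN, VOCAB)
out, _ = model(x)
logits = proj(out)                               # (BATCH, MAX_LEN, VOCAB)

# Next-character targets, shifted by one; padding shows up here too.
targets = seqs.roll(-1, dims=1)

# What I'm doing now: the loss averages over EVERY position,
# including the padded tail, so the model learns to emit PAD.
naive_loss = nn.functional.cross_entropy(
    logits.reshape(-1, VOCAB), targets.reshape(-1))

# What I think I want: skip padded positions in the loss entirely.
masked_loss = nn.functional.cross_entropy(
    logits.reshape(-1, VOCAB), targets.reshape(-1),
    ignore_index=PAD_IDX)

print(naive_loss.item(), masked_loss.item())
```

Is masking the loss like this enough on its own, or do I still need to pack the sequences before feeding them to the LSTM?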