My task is order-sensitive, so I don’t want to sort my mini-batch by sequence length just to use the pack_padded_sequence function.
I just noticed that the output of an LSTM differs depending on whether or not the input goes through pack_padded_sequence first.

Could anyone explain this result?

It would be much easier to comment if you posted an example. One would expect the forward-direction outputs up to each sequence's length to be the same with and without packing.
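
To make that expectation concrete, here is a minimal sketch, not your actual setup: the batch, the sizes, and the single-layer unidirectional LSTM are all made up for illustration. It runs the same LSTM on a padded batch with and without packing and compares the results.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)

# Illustrative toy batch: 3 sequences padded to length 5, feature size 4.
# Lengths are already sorted in decreasing order, as the default packing requires.
lengths = torch.tensor([5, 3, 2])
x = torch.randn(3, 5, 4)
for i, n in enumerate(lengths.tolist()):
    x[i, n:] = 0.0  # zero out the padded time steps

lstm = nn.LSTM(input_size=4, hidden_size=6, batch_first=True)

with torch.no_grad():
    # Without packing: the LSTM also steps through the padded positions.
    out_padded, (h_padded, _) = lstm(x)

    # With packing: the LSTM stops at each sequence's true length.
    packed = pack_padded_sequence(x, lengths, batch_first=True)
    out_packed, (h_packed, _) = lstm(packed)
    out_unpacked, _ = pad_packed_sequence(out_packed, batch_first=True, total_length=5)

# Outputs at the valid (non-padded) time steps should agree.
valid_match = all(
    torch.allclose(out_padded[i, :n], out_unpacked[i, :n], atol=1e-5)
    for i, n in enumerate(lengths.tolist())
)
print("valid positions match:", valid_match)

# After unpacking, the padded positions are filled with zeros.
print("padded positions are zero:", bool((out_unpacked[1, 3:] == 0).all()))

# The final hidden state differs for the shorter sequences, because without
# packing it is taken after the padded steps have been processed.
print("h_n equal for sequence 1:", torch.allclose(h_padded[0, 1], h_packed[0, 1]))
```

If your comparison looks at the padded positions, at h_n, or at the backward direction of a bidirectional LSTM, a difference there would be expected rather than a bug.
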
That said, sorting the batch for packing and then restoring the original order after unpacking can be a good option, too.
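
A minimal sketch of the pack-then-restore idea, assuming a PyTorch version whose pack_padded_sequence accepts the enforce_sorted argument (the batch and sizes are again invented for illustration). With enforce_sorted=False the sorting and un-sorting happen internally, so the batch comes back in its original order.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)

# Illustrative toy batch whose lengths are deliberately NOT sorted.
lengths = torch.tensor([2, 5, 3])
x = torch.randn(3, 5, 4)

lstm = nn.LSTM(input_size=4, hidden_size=6, batch_first=True)

with torch.no_grad():
    # enforce_sorted=False sorts internally and remembers how to undo it.
    packed = pack_padded_sequence(x, lengths, batch_first=True, enforce_sorted=False)
    out_packed, (h_n, _) = lstm(packed)

    # Unpacking returns the batch in the original (unsorted) order.
    out, out_lengths = pad_packed_sequence(out_packed, batch_first=True)

    # Reference: run every sequence on its own, without any padding.
    singles = [lstm(x[i:i + 1, :n])[0][0] for i, n in enumerate(lengths.tolist())]

# Row i of the unpacked output should match sequence i processed alone.
order_preserved = all(
    torch.allclose(out[i, :n], singles[i], atol=1e-5)
    for i, n in enumerate(lengths.tolist())
)
print("original order preserved:", order_preserved)
print("returned lengths:", out_lengths.tolist())
```

On versions without enforce_sorted, the same effect can be had by sorting the lengths yourself, indexing the batch with the sort permutation, and applying the inverse permutation to the unpacked output.
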

Best regards