If I pad the input sequences and use a masked loss, does it still make sense to use the pack/unpack operations for RNN layers?

As far as I can tell, there are a couple of ways to feed batched sequence input to an RNN-based model. My approach is:

  1. Sort the inputs by length.
  2. Pad each batch to the max length in that batch.
  3. Use a masked loss to avoid the side effects of the padding.
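Concretely, my pipeline looks something like this (a minimal PyTorch sketch with made-up shapes and a toy regression target, just to show the idea):

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence

# Toy batch: three variable-length sequences with feature dim 4 (made-up data).
seqs = [torch.randn(5, 4), torch.randn(3, 4), torch.randn(2, 4)]

# 1. Sort by length, descending.
seqs = sorted(seqs, key=len, reverse=True)
lengths = torch.tensor([len(s) for s in seqs])

# 2. Pad to the max length in the batch -> (batch, max_len, 4).
padded = pad_sequence(seqs, batch_first=True)

rnn = nn.GRU(input_size=4, hidden_size=8, batch_first=True)
out, _ = rnn(padded)                        # (batch, max_len, 8)
logits = nn.Linear(8, 1)(out).squeeze(-1)   # (batch, max_len)

# 3. Masked loss: zero out the contribution of padded time steps.
targets = torch.randn_like(logits)
mask = (torch.arange(padded.size(1))[None, :] < lengths[:, None]).float()
per_step = nn.functional.mse_loss(logits, targets, reduction="none")
loss = (per_step * mask).sum() / mask.sum()
```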

I also see people suggest using the pack and unpack operations with RNN layers. Does that actually add anything in my setting, or is it just another way of solving the same problem? Or does it offer something else that I'm overlooking?
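For reference, the pack/unpack usage I mean is `pack_padded_sequence` / `pad_packed_sequence` around the RNN call, roughly like this (toy shapes, sequences already length-sorted):

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import (
    pad_sequence, pack_padded_sequence, pad_packed_sequence,
)

# Toy batch, already sorted by length (made-up data).
seqs = [torch.randn(5, 4), torch.randn(3, 4), torch.randn(2, 4)]
lengths = torch.tensor([5, 3, 2])
padded = pad_sequence(seqs, batch_first=True)

rnn = nn.GRU(input_size=4, hidden_size=8, batch_first=True)

# Pack: the RNN skips the padded steps, and h_n is the hidden state at each
# sequence's true last step rather than at the padded max_len step.
packed = pack_padded_sequence(padded, lengths, batch_first=True,
                              enforce_sorted=True)
packed_out, h_n = rnn(packed)

# Unpack back to a padded (batch, max_len, hidden) tensor for the loss.
out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)
```

As I understand it, this mainly changes what happens *inside* the RNN (no computation on pad steps, correct final hidden states), whereas the masked loss only fixes things *after* the RNN, so I'm unsure how much they overlap.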