PackedSequence for a non-RNN model

I have a custom transformer-like model that takes sequences of varying length as input.
I’d like to parallelize it across GPUs, so I want to pack multiple sequences of different lengths into one tensor that I can pass through DataParallel.
I am looking for something like PackedSequence, but one that I can conveniently unpack inside the model to get the original sequences back with the padding removed.
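
To make the question concrete, here is a rough sketch of the behaviour I have in mind. It is not an existing API: `pack_batch`, `unpack_batch`, and `LengthAwareModel` are hypothetical names, and the mean-pooling in `forward` is only a placeholder for the real model. It assumes each sequence is a `(length, dim)` tensor and relies on DataParallel splitting tensor inputs along dim 0, so the padded batch and the lengths tensor should stay aligned on each GPU.

```python
import torch
from torch import nn
from torch.nn.utils.rnn import pad_sequence


def pack_batch(seqs):
    """Pad a list of (length_i, dim) tensors into one (batch, max_len, dim) tensor.

    Returns the padded tensor and a 1-D tensor holding the true lengths.
    """
    lengths = torch.tensor([s.size(0) for s in seqs], dtype=torch.long)
    padded = pad_sequence(seqs, batch_first=True)  # zero-padded on the right
    return padded, lengths


def unpack_batch(padded, lengths):
    """Slice the padding off again, giving back a list of (length_i, dim) tensors."""
    return [padded[i, :n] for i, n in enumerate(lengths.tolist())]


class LengthAwareModel(nn.Module):
    """Hypothetical wrapper: receives the padded batch and unpacks it internally."""

    def forward(self, padded, lengths):
        seqs = unpack_batch(padded, lengths)
        # Placeholder computation: one mean-pooled vector per sequence.
        # The real per-sequence model logic would go here.
        return torch.stack([s.mean(dim=0) for s in seqs])


seqs = [torch.ones(3, 4), torch.ones(5, 4) * 2]
padded, lengths = pack_batch(seqs)

model = LengthAwareModel()
# With multiple GPUs this would presumably become:
#   model = nn.DataParallel(model).cuda()
# and both tensor arguments would be scattered along dim 0.
out = model(padded, lengths)
```

Both arguments to `forward` are plain tensors with the batch on dim 0, which is why I think this would survive the scatter step, but each replica still carries padding up to the longest sequence in the whole batch.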
Are there any existing implementations, or tips on how to do this?