Making a batch of 2D lists with varying lengths for non-RNN purposes

Each of my samples has this structure:


The outer list can have up to 100 elements, and each inner list can have up to 140 elements, each of which is an embedding index.
I'm trying to compute a meta-embedding for each inner list by merging its embeddings into one embedding per inner list, and then compute my attention layer over those merged embeddings.
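To make the intended computation concrete, here is a minimal sketch of it on an already padded tensor. It rests on several assumptions that are not part of the setup above: index 0 is reserved for padding, the "merge" is a masked mean over each inner list, the attention is a simple learned score per inner list, and every sample has at least one non-empty inner list. The class name `InnerPoolThenAttend` and its parameters are hypothetical.

```python
import torch
import torch.nn as nn


class InnerPoolThenAttend(nn.Module):
    """Hypothetical module: masked mean per inner list, then attention over inner lists."""

    def __init__(self, vocab_size, dim, pad_idx=0):
        super().__init__()
        self.pad_idx = pad_idx  # assumption: this index is never a real token
        self.embed = nn.Embedding(vocab_size, dim, padding_idx=pad_idx)
        self.score = nn.Linear(dim, 1)  # one attention score per inner list

    def forward(self, ids):
        # ids: (batch, outer, inner) LongTensor, padded with pad_idx
        mask = ids != self.pad_idx                      # (B, O, I), True for real tokens
        emb = self.embed(ids)                           # (B, O, I, D)
        m = mask.unsqueeze(-1).to(emb.dtype)            # (B, O, I, 1)

        # Masked mean over the inner dimension; clamp avoids division by zero
        # for inner lists that consist only of padding.
        counts = m.sum(dim=2).clamp(min=1.0)            # (B, O, 1)
        inner = (emb * m).sum(dim=2) / counts           # (B, O, D)

        # An inner list is real if it contains at least one real token.
        outer_mask = mask.any(dim=2)                    # (B, O)

        # Padded inner lists get -inf so softmax assigns them zero weight.
        scores = self.score(inner).squeeze(-1)          # (B, O)
        scores = scores.masked_fill(~outer_mask, float("-inf"))
        weights = torch.softmax(scores, dim=1)          # (B, O)

        pooled = (weights.unsqueeze(-1) * inner).sum(dim=1)  # (B, D)
        return pooled, weights


# Two samples, up to 2 inner lists of up to 3 indices each, padded with 0.
ids = torch.tensor([[[1, 2, 0], [3, 0, 0]],
                    [[4, 5, 6], [0, 0, 0]]])
model = InnerPoolThenAttend(vocab_size=10, dim=4)
pooled, weights = model(ids)
```

With this layout no packing is needed: the masks take over the role that `pack_padded_sequence` plays for RNNs. A sample whose inner lists are all padding would produce NaN in the softmax, so that case would need separate handling.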
My problem is how to batch this. I need padding and packing at two levels, and that isn't supported by the standard pad and pack functions. Any hint would be useful.
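For reference, one possible way to do the two-level padding by hand is sketched below. It is only a sketch under assumptions: each sample is a plain Python list of lists of ints, index 0 is free to use as the padding value, and the batch contains at least one non-empty inner list. The function name `pad_nested` is hypothetical.

```python
import torch


def pad_nested(batch, pad_idx=0):
    """Pad a batch of list-of-lists into one (batch, max_outer, max_inner) tensor.

    batch: list of samples; each sample is a list of lists of int indices.
    Returns the padded ids plus the true lengths at both levels.
    """
    max_outer = max(len(sample) for sample in batch)
    max_inner = max(len(inner) for sample in batch for inner in sample)

    # Start from a tensor full of padding and copy the real indices in.
    ids = torch.full((len(batch), max_outer, max_inner), pad_idx, dtype=torch.long)
    inner_lens = torch.zeros(len(batch), max_outer, dtype=torch.long)
    outer_lens = torch.zeros(len(batch), dtype=torch.long)

    for b, sample in enumerate(batch):
        outer_lens[b] = len(sample)
        for o, inner in enumerate(sample):
            inner_lens[b, o] = len(inner)
            ids[b, o, :len(inner)] = torch.tensor(inner, dtype=torch.long)

    return ids, inner_lens, outer_lens


# Toy batch: sample 0 has two inner lists, sample 1 has one.
batch = [[[1, 2, 3], [4]], [[5, 6]]]
ids, inner_lens, outer_lens = pad_nested(batch)
```

A function like this could be passed as `collate_fn` to `torch.utils.data.DataLoader`. With the limits described above, the padded tensor is at most (batch, 100, 140), and the length tensors can be turned into masks instead of using packed sequences.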
Thanks in advance.