Why do lengths have to be given in sorted order in pack_padded_sequence?

Hi there,

I am training a hierarchical model using PyTorch. This proved to be considerably difficult, as the masking needs a sorted list of lengths. In a hierarchical model, I can sort the samples by sentence length and get away with padding the sentences. But I will definitely have to mask the words, and PyTorch only supports packing them in sorted order.

Is there a reason for enforcing the sorted order, some sort of optimization? Can I make it accept inputs without sorting?
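To illustrate the constraint, here is a minimal sketch (with a hypothetical toy batch) of what pack_padded_sequence expects when sorting is enforced: the lengths must already be in decreasing order, and the resulting `batch_sizes` tell the RNN how many sequences are still active at each time step.

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

# Hypothetical toy batch: 3 padded sequences, max length 5, feature size 4.
batch = torch.randn(3, 5, 4)            # (batch, max_len, features)
lengths = torch.tensor([5, 3, 2])       # must be in decreasing order here

packed = pack_padded_sequence(batch, lengths, batch_first=True)
# batch_sizes counts how many sequences are still "alive" at each step:
# all 3 at steps 0-1, then 2 at step 2, then only the longest at steps 3-4.
print(packed.batch_sizes)               # tensor([3, 3, 2, 1, 1])
```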

Regards
Sandeep

This restriction is a cuDNN restriction that we’re dealing with :frowning:


Yeah, this puts a lot of constraints on any complicated networks that I want to try.

I can use GRUCell, but it does not support bidirectionality or masking!

I have also been working with padded sequences and the need to sort them. I am currently running into a problem where I have 2 different inputs and 2 RNNs. I can sort both inputs according to their lengths, but then the respective dimensions no longer match each other, so I have to restore the original order after running through the RNN (basically as shown here: RNNs Sorting operations autograd safe?). I’m just worried that this operation is not autograd-safe, i.e. that the gradients get lost or are associated with the wrong matrix entries after resorting.
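For what it's worth, restoring the original order with an inverse permutation should be autograd-safe, since tensor indexing is differentiable and routes gradients back to the correct rows. A minimal sketch, with made-up lengths:

```python
import torch

# Hypothetical lengths; sort descending, then compute the inverse permutation.
lengths = torch.tensor([3, 7, 5])
sort_idx = torch.argsort(lengths, descending=True)   # permutation to sorted order
unsort_idx = torch.argsort(sort_idx)                 # inverse permutation

x = torch.randn(3, 4, requires_grad=True)
sorted_x = x[sort_idx]           # this is what you would feed to the RNN
restored = sorted_x[unsort_idx]  # back in the caller's original order

assert torch.equal(restored, x)  # original order recovered exactly
restored.sum().backward()        # gradients flow back to x, row for row
```

Because indexing is just a gather, the backward pass scatters gradients to the rows they came from, so nothing is lost or misaligned by the resorting.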

Do you know about similar problems or can help?


Is this still the case?

Does anyone know how to do this?

Just out of curiosity, perhaps after 2 years it’s been solved, so I wanted to bump this to document whether it was solved :slight_smile: pack_padded_sequence

Thanks for the help!

It has been solved now. You can give unsorted lists and it’s fine.


Hey @smth, we can give an unsorted list to pack_padded_sequence and it is going to work fine?

The documentation still talks about sorting, so the issue seems unresolved; maybe the docs should be updated.

https://pytorch.org/docs/stable/nn.html#pack-padded-sequence

For unsorted sequences, use `enforce_sorted=False`. If `enforce_sorted` is `True`, the sequences should be sorted by length in decreasing order, i.e. `input[:,0]` should be the longest sequence and `input[:,B-1]` the shortest one. `enforce_sorted=True` is only necessary for ONNX export.
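A short sketch of the `enforce_sorted=False` path, using a hypothetical unsorted batch: pack_padded_sequence sorts internally and remembers the permutation, so unpacking returns everything in the caller's original order.

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Hypothetical batch whose lengths are NOT in decreasing order.
batch = torch.randn(3, 5, 4)            # (batch, max_len, features)
lengths = torch.tensor([2, 5, 3])

packed = pack_padded_sequence(batch, lengths, batch_first=True,
                              enforce_sorted=False)
# The packed object carries sorted_indices/unsorted_indices, so unpacking
# restores the original batch order and the original lengths.
unpacked, out_lengths = pad_packed_sequence(packed, batch_first=True)
print(out_lengths)                      # tensor([2, 5, 3])
```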


GitHub issue: update docs that sorting is not needed in · Issue #23079 · pytorch/pytorch · GitHub