Hi all,
I am using PyTorch to implement a batched version of a sequence-to-sequence model. The pack_padded_sequence function and the PackedSequence class are helpful.
But for the pack_padded_sequence function, it requires:
The sequences should be sorted by length in a decreasing order, i.e. input[:,0] should be the longest sequence, and input[:,B-1] the shortest one.
(from torch.nn — PyTorch master documentation)
This seems reasonable, but the longest source sequence may not correspond to the longest target sequence.
For example, if the source (encoder) lengths are [10, 9, 8, 3], the corresponding target (decoder) lengths may be [5, 7, 8, 3].
If I want to use pack_padded_sequence, what I can do is: 1) write everything myself, including the masking mechanism for padding; or 2) after encoding, reorder the encoder's output matrix to match the decoder's sequence-length order. But this is time-consuming and prone to bugs.
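For what it's worth, here is a minimal sketch of the sort-then-unsort pattern for option 2. The GRU encoder, batch_first=True layout, and all tensor shapes are my own assumptions for illustration, not from the original post:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Hypothetical batch: 4 padded source sequences (batch_first=True layout)
batch_size, max_len, emb_dim, hidden = 4, 10, 8, 16
src = torch.randn(batch_size, max_len, emb_dim)
src_lens = torch.tensor([10, 9, 8, 3])  # source lengths, as in the example above

# Sort by length (decreasing) before packing, and remember the permutation
sorted_lens, sort_idx = src_lens.sort(descending=True)
unsort_idx = sort_idx.argsort()  # inverse permutation to restore original order

encoder = nn.GRU(emb_dim, hidden, batch_first=True)
packed = pack_padded_sequence(src[sort_idx], sorted_lens, batch_first=True)
packed_out, h_n = encoder(packed)
enc_out, _ = pad_packed_sequence(packed_out, batch_first=True)

# Undo the sort so each row again lines up with its (differently ordered) target
enc_out = enc_out[unsort_idx]   # (batch, max_len, hidden)
h_n = h_n[:, unsort_idx]        # (num_layers, batch, hidden)
```

The two index tensors are the fragile part I was referring to: every encoder output (and hidden state) has to be unsorted consistently, or the decoder silently pairs sources with the wrong targets.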
Should the length order requirement of these functions be removed?