Hi all,
I am using PyTorch to implement a batched version of a sequence-to-sequence model. The pack_padded_sequence function and the PackedSequence class are helpful.
But for the pack_padded_sequence function, it requires:
The sequences should be sorted by length in a decreasing order, i.e. input[:,0] should be the longest sequence, and input[:,B-1] the shortest one.
(from torch.nn — PyTorch master documentation)
This seems reasonable, but the longest source sequence may not correspond to the longest target sequence.
For example, if the source (encoder) lengths are [10, 9, 8, 3], the corresponding target (decoder) lengths may be [5, 7, 8, 3].
If I want to use pack_padded_sequence, what I can do is: 1) write everything myself, including the masking mechanism for padding; or 2) after encoding, reorder the encoder's output matrix to match the decoder's sequence-length order. But this is time-consuming and prone to bugs.
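For what it's worth, here is a minimal sketch of the sort-then-unsort pattern for option 2. The GRU encoder, batch_first=True layout, and all tensor shapes are my own assumptions for illustration, not from the original post:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Hypothetical batch: 4 padded source sequences (batch_first=True layout)
batch_size, max_len, emb_dim, hidden = 4, 10, 8, 16
src = torch.randn(batch_size, max_len, emb_dim)
src_lens = torch.tensor([10, 9, 8, 3])  # source lengths, as in the example above

# Sort by length (decreasing) before packing, and remember the permutation
sorted_lens, sort_idx = src_lens.sort(descending=True)
unsort_idx = sort_idx.argsort()  # inverse permutation to restore original order

encoder = nn.GRU(emb_dim, hidden, batch_first=True)
packed = pack_padded_sequence(src[sort_idx], sorted_lens, batch_first=True)
packed_out, h_n = encoder(packed)
enc_out, _ = pad_packed_sequence(packed_out, batch_first=True)

# Undo the sort so each row again lines up with its (differently ordered) target
enc_out = enc_out[unsort_idx]   # (batch, max_len, hidden)
h_n = h_n[:, unsort_idx]        # (num_layers, batch, hidden)
```

The two index tensors are the fragile part I was referring to: every encoder output (and hidden state) has to be unsorted consistently, or the decoder silently pairs sources with the wrong targets.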
Should the length order requirement of these functions be removed?