I was going through this explanation of packing sequences and how to use them, but I don't understand how to get the last output for each element of the batch after calling pad_packed_sequence, so that it can then be fed into a linear layer. The problem is that for shorter sequences the outputs at the padded steps are simply 0. Maybe the solution is to index the output with the sequence lengths?
Update: this seems to work for me:
```python
def forward(self, x, seq_lengths):
    # Sort the batch by length, longest first (pack_padded_sequence
    # requires this unless you pass enforce_sorted=False).
    seq_lengths, perm_idx = seq_lengths.sort(0, descending=True)
    x = x[perm_idx]
    x = self.embed(x)
    x = x.permute(1, 0, 2)  # (batch, seq, emb) -> (seq, batch, emb)
    x = pack_padded_sequence(x, seq_lengths)
    x, _ = self.rnn(x)
    x, input_sizes = pad_packed_sequence(x)
    # Take the last valid time step of each sequence and undo the sort,
    # writing each row back to its original position in the batch.
    last_output = torch.empty(len(input_sizes), self.hidden_size, dtype=torch.float32)
    for i, inp_size in enumerate(input_sizes):
        last_output[perm_idx[i]] = x[inp_size - 1, i]
    x = self.linear(last_output)
    x = self.sigmoid(x)
    return x
```
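As an aside, the per-example loop over `input_sizes` can be replaced by a single `torch.gather` call along the time dimension. A minimal self-contained sketch of that idea (the tensor sizes and random data here are illustrative, not from the model above):

```python
import torch

# Toy padded RNN output: (seq_len, batch, hidden), as pad_packed_sequence returns it.
torch.manual_seed(0)
seq_len, batch, hidden = 5, 3, 4
output = torch.randn(seq_len, batch, hidden)
lengths = torch.tensor([5, 3, 2])  # true length of each sequence in the batch

# Build an index of the last valid time step per sequence, broadcast over hidden dims,
# then gather along dim 0 (time) and drop the singleton time dimension.
idx = (lengths - 1).view(1, batch, 1).expand(1, batch, hidden)
last_output = output.gather(0, idx).squeeze(0)  # shape: (batch, hidden)

# Equivalent explicit loop, for comparison.
manual = torch.stack([output[lengths[i] - 1, i] for i in range(batch)])
print(torch.equal(last_output, manual))  # True
```

Note this extracts the last outputs in the (sorted) batch order; un-sorting with `perm_idx` would still be needed afterwards if the batch was reordered before packing.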