I was going through this explanation of packing sequences and how to use them, but I don't understand how to get the last output for each element of the batch after calling pad_packed_sequence, so that it can then be fed into a linear layer. The problem is that for shorter sequences the outputs at the padded steps are simply 0. Maybe the solution is to index the output with the sequence lengths?
Update: this seems to work for me:
```python
def forward(self, x, seq_lengths):
    # Sort the batch by length, longest first (pack_padded_sequence
    # requires this unless you pass enforce_sorted=False).
    seq_lengths, perm_idx = seq_lengths.sort(0, descending=True)
    x = x[perm_idx]
    x = self.embed(x)
    x = x.permute(1, 0, 2)  # (batch, seq, emb) -> (seq, batch, emb)
    x = pack_padded_sequence(x, seq_lengths)
    x, _ = self.rnn(x)
    x, input_sizes = pad_packed_sequence(x)
    # Take the last valid time step of each sequence and undo the sort,
    # writing each row back to its original position in the batch.
    last_output = torch.empty(len(input_sizes), self.hidden_size, dtype=torch.float32)
    for i, inp_size in enumerate(input_sizes):
        last_output[perm_idx[i]] = x[inp_size - 1, i]
    x = self.linear(last_output)
    x = self.sigmoid(x)
    return x
```
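As an aside, the per-example loop over `input_sizes` can be replaced by a single `torch.gather` call along the time dimension. A minimal self-contained sketch of that idea (the tensor sizes and random data here are illustrative, not from the model above):

```python
import torch

# Toy padded RNN output: (seq_len, batch, hidden), as pad_packed_sequence returns it.
torch.manual_seed(0)
seq_len, batch, hidden = 5, 3, 4
output = torch.randn(seq_len, batch, hidden)
lengths = torch.tensor([5, 3, 2])  # true length of each sequence in the batch

# Build an index of the last valid time step per sequence, broadcast over hidden dims,
# then gather along dim 0 (time) and drop the singleton time dimension.
idx = (lengths - 1).view(1, batch, 1).expand(1, batch, hidden)
last_output = output.gather(0, idx).squeeze(0)  # shape: (batch, hidden)

# Equivalent explicit loop, for comparison.
manual = torch.stack([output[lengths[i] - 1, i] for i in range(batch)])
print(torch.equal(last_output, manual))  # True
```

Note this extracts the last outputs in the (sorted) batch order; un-sorting with `perm_idx` would still be needed afterwards if the batch was reordered before packing.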