I was going through this explanation of packing sequences and how to use them, but I don't understand how to get the last output for each element of the batch after calling pad_packed_sequence, so that it can then be fed into a linear layer. The problem is that for shorter sequences the outputs at the padded steps are simply 0. Maybe the solution is to index the output with the sequence lengths?
Update: this seems to work for me:
```python
def forward(self, x, seq_lengths):
    # Sort the batch by length, longest first (pack_padded_sequence
    # requires this unless you pass enforce_sorted=False).
    seq_lengths, perm_idx = seq_lengths.sort(0, descending=True)
    x = x[perm_idx]
    x = self.embed(x)
    x = x.permute(1, 0, 2)  # (batch, seq, emb) -> (seq, batch, emb)
    x = pack_padded_sequence(x, seq_lengths)
    x, _ = self.rnn(x)
    x, input_sizes = pad_packed_sequence(x)
    # Take the last valid time step of each sequence and undo the sort,
    # writing each row back to its original position in the batch.
    last_output = torch.empty(len(input_sizes), self.hidden_size, dtype=torch.float32)
    for i, inp_size in enumerate(input_sizes):
        last_output[perm_idx[i]] = x[inp_size - 1, i]
    x = self.linear(last_output)
    x = self.sigmoid(x)
    return x
```
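As an aside, the per-example loop over `input_sizes` can be replaced by a single `torch.gather` call along the time dimension. A minimal self-contained sketch of that idea (the tensor sizes and random data here are illustrative, not from the model above):

```python
import torch

# Toy padded RNN output: (seq_len, batch, hidden), as pad_packed_sequence returns it.
torch.manual_seed(0)
seq_len, batch, hidden = 5, 3, 4
output = torch.randn(seq_len, batch, hidden)
lengths = torch.tensor([5, 3, 2])  # true length of each sequence in the batch

# Build an index of the last valid time step per sequence, broadcast over hidden dims,
# then gather along dim 0 (time) and drop the singleton time dimension.
idx = (lengths - 1).view(1, batch, 1).expand(1, batch, hidden)
last_output = output.gather(0, idx).squeeze(0)  # shape: (batch, hidden)

# Equivalent explicit loop, for comparison.
manual = torch.stack([output[lengths[i] - 1, i] for i in range(batch)])
print(torch.equal(last_output, manual))  # True
```

Note this extracts the last outputs in the (sorted) batch order; un-sorting with `perm_idx` would still be needed afterwards if the batch was reordered before packing.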