Hi, I have a sequence input with `input.size() = [17, 1, 64]` (seq_len, batch, channel), and I use `nn.LSTM(64, 16, bidirectional=True)` to get `output, (hn, cn)`. In this case, `output.size() = [17, 1, 32]`, `hn.size() = [2, 1, 16]`, and `cn.size() = [2, 1, 16]`. I concatenate `(hn, cn)` as the final feature for the next application, which works well.
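For context, here is a minimal sketch of the single-sequence setup described above (the random input and the concatenation scheme are assumptions based on the shapes given):

```python
import torch
import torch.nn as nn

# Single sequence: (seq_len, batch, channel) = (17, 1, 64)
lstm = nn.LSTM(64, 16, bidirectional=True)
x = torch.randn(17, 1, 64)

output, (hn, cn) = lstm(x)
# output: (17, 1, 32) -> seq_len, batch, num_directions * hidden_size
# hn, cn: (2, 1, 16)  -> num_layers * num_directions, batch, hidden_size

# One possible final feature: concatenate hn and cn across directions,
# then flatten per batch element -> (batch, 4 * hidden_size) = (1, 64)
feat = torch.cat([hn, cn], dim=0).permute(1, 0, 2).reshape(1, -1)
```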
When there are multiple inputs with variable lengths, I use `pad_sequence` and `pack_padded_sequence` to build a batch, i.e., a `PackedSequence` object. The model forwards successfully. However, when I use `h_n.view(num_layers, num_directions, batch, hidden_size).permute(2, 0, 1, 3).contiguous().view(batch, -1)` to obtain the final feature for each input sequence, the downstream applications get poor performance.
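For reference, this is roughly the batched path I am describing (the three example lengths, sorted descending so `enforce_sorted` holds, are assumptions):

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

lstm = nn.LSTM(64, 16, bidirectional=True)

# Three variable-length sequences, lengths sorted in descending order
seqs = [torch.randn(L, 64) for L in (17, 12, 9)]
lengths = torch.tensor([s.size(0) for s in seqs])

padded = pad_sequence(seqs)                     # (max_len, batch, 64)
packed = pack_padded_sequence(padded, lengths)  # PackedSequence

_, (h_n, c_n) = lstm(packed)                    # h_n: (2, 3, 16)

num_layers, num_directions, batch, hidden_size = 1, 2, 3, 16
feat = (h_n.view(num_layers, num_directions, batch, hidden_size)
           .permute(2, 0, 1, 3).contiguous().view(batch, -1))
print(feat.shape)  # torch.Size([3, 32])
```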
So, here are my questions:
(1) Do `(hn, cn)` represent the final time step of the padded batch, or exactly the final valid time step of each variable-length sequence?
(2) If they are for the padded batch: how can I get the accurate final states `(h, c)` for each variable-length sequence?
(3) If they are already per sequence: why does the performance drop? Are there other problems I should check?