Hi,
The LSTM documentation says that with batch_first=True, inputs and outputs use the (batch, ...) layout, so I expected the last hidden state of a bidirectional LSTM fed with a pack_padded_sequence to have shape (batch, num_directions, hidden_size). However, the last hidden state comes out with shape (num_directions, batch, hidden_size), even though batch_first is set to True. This is the code:
import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence, PackedSequence
from torch import nn
sequences = torch.FloatTensor([[1, 2, 0, 0, 0, 0],      # length 2
                               [3, 4, 5, 0, 0, 0],      # length 3
                               [5, 6, 0, 0, 0, 0],      # length 2
                               [8, 9, 10, 11, 12, 0]])  # length 5
seq_lengths = torch.LongTensor([2, 3, 2, 5])
rnn = nn.LSTM(1, 5, batch_first=True, bidirectional=True)
packed_sequences = pack_padded_sequence(sequences.unsqueeze(2),
lengths=seq_lengths,
batch_first=True,
enforce_sorted=False)
rnn_output, (hn, cn) = rnn(packed_sequences)
print(hn.shape)  # torch.Size([2, 4, 5]) -- shouldn't it be torch.Size([4, 2, 5])?
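For now I am working around it by swapping the first two axes myself; this sketch assumes hn's leading dimension is (num_layers * num_directions), so with one layer a simple transpose gives me the batch-first layout I expected:

```python
import torch
from torch import nn
from torch.nn.utils.rnn import pack_padded_sequence

sequences = torch.FloatTensor([[1, 2, 0, 0, 0, 0],      # length 2
                               [3, 4, 5, 0, 0, 0],      # length 3
                               [5, 6, 0, 0, 0, 0],      # length 2
                               [8, 9, 10, 11, 12, 0]])  # length 5
seq_lengths = torch.LongTensor([2, 3, 2, 5])

rnn = nn.LSTM(1, 5, batch_first=True, bidirectional=True)
packed = pack_padded_sequence(sequences.unsqueeze(2), lengths=seq_lengths,
                              batch_first=True, enforce_sorted=False)
_, (hn, cn) = rnn(packed)

# hn is (num_layers * num_directions, batch, hidden_size) = (2, 4, 5);
# transpose the first two axes to get (batch, num_directions, hidden_size)
hn_batch_first = hn.transpose(0, 1)
print(hn_batch_first.shape)  # torch.Size([4, 2, 5])
```

Is this transpose the intended way to recover a batch-first hidden state, or is the shape I am seeing a bug?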