Error "start () + length () exceeds dimension size ()" on using pack_padded_sequence output as input for LSTM layer

I am trying to use the pack_padded_sequence function and feed its output into an LSTM layer, as follows:

import torch
import torch.nn as nn

class LSTMClassification(nn.Module):

    def __init__(self, input_dim, hidden_dim, target_size):
        super(LSTMClassification, self).__init__()
        self.hidden_dim = hidden_dim
        self.lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True, num_layers=3, dropout=0.3, bidirectional=True)

        # The linear layer that maps from hidden state space to tag space
        self.fc = nn.Linear(hidden_dim*2, target_size)

    def forward(self, input_):
        input_lengths = torch.LongTensor([torch.max(input_[i, :].data.nonzero()) + 1 for i in range(input_.size()[0])])
        # Then pack the sequences
        packed_input = nn.utils.rnn.pack_padded_sequence(input_, input_lengths.cpu().numpy(), batch_first=True, enforce_sorted=False)
        lstm_out, (h, c) = self.lstm(packed_input)
        logits = self.fc(lstm_out[:,-1])
        return logits

However, I am getting an error stating "start (384) + length (8) exceeds dimension size (384)". I couldn't figure out the cause of the error, so I wrote a small simulation to reproduce it. The error can be seen when the following code is run.

model = LSTMClassification(76,
                           hidden_dim=8,
                           target_size=1)

input_ = torch.randn(8, 48, 76)
model(input_)
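
Printing the input_lengths that the forward pass computes for this simulated input (a quick check of my own, reusing the same expression as in forward) shows values larger than the 48 timesteps:

import torch

input_ = torch.randn(8, 48, 76)
input_lengths = torch.LongTensor(
    [torch.max(input_[i, :].data.nonzero()) + 1 for i in range(input_.size(0))])
print(input_lengths)   # tensor([76, 76, 76, 76, 76, 76, 76, 76]) -- larger than seq_len (48)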

The issue seems similar to "I used pack_padded_sequence() and put in lstm layer, but I got start () + length () exceeds dimension size () error", but I don't see any clear answer in that thread on how the problem was solved.

I'm not sure if I completely understand your use case, but since you are using batch_first=True, the input is expected to have the shape [batch_size, seq_len, features], so 48 would be the sequence length? If so, then input_lengths would be wrong, since it's returning a tensor containing 76 for each sample. Once this is fixed, you would have to check lstm_out, as it would also be a PackedSequence, so I guess you might want to access the return value via lstm_out.data.
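
In case it helps, here is a minimal sketch of what that fix could look like. It assumes padded timesteps are encoded as all-zero rows along the time dimension; the function name forward_packed and that masking rule are my assumptions, not something taken from the original model:

import torch
import torch.nn as nn

def forward_packed(model, input_):
    # input_: [batch_size, seq_len, features] because batch_first=True
    # A timestep counts as "real" if any of its features is non-zero
    mask = input_.abs().sum(dim=2) != 0          # [batch_size, seq_len]
    input_lengths = mask.sum(dim=1)              # one length per sample, at most seq_len

    packed_input = nn.utils.rnn.pack_padded_sequence(
        input_, input_lengths.cpu(), batch_first=True, enforce_sorted=False)
    packed_out, (h, c) = model.lstm(packed_input)

    # packed_out is a PackedSequence; unpack it before indexing timesteps
    lstm_out, out_lengths = nn.utils.rnn.pad_packed_sequence(
        packed_out, batch_first=True)            # [batch_size, seq_len, 2 * hidden_dim]

    # take the output at the last valid timestep of each sequence
    last_step = lstm_out[torch.arange(lstm_out.size(0)), out_lengths - 1]
    return model.fc(last_step)

With the simulated input above (where every length comes out as 48), forward_packed(model, input_) returns an [8, 1] tensor instead of raising the error.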

@ptrblck To give more context, I'm trying to port a model from Keras to PyTorch. Specifically, the Keras model applies a Masking layer.

Referring to this blog - section "How the PackedSequence object works", what I understood is that input_lengths refers to the length of the longest sequence in a batch, where the length is defined by the position of the last non-zero element in the sequence.

Did I understand the concept wrongly?

I think I got the idea of input_lengths for pack_padded_sequence by following this gist. It is the number of non-padded (non-zero) timesteps in each sequence of a batch. In my case, that number can go up to 48, which is the maximum sequence length in a batch.
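
As a small made-up illustration of that interpretation (assuming all-zero rows mark padding), input_lengths holds one value per sequence, each at most the padded length:

import torch

# Hypothetical batch: 3 sequences padded to length 5, with 2 features each
batch = torch.zeros(3, 5, 2)
batch[0, :5] = 1.0   # real length 5
batch[1, :3] = 1.0   # real length 3
batch[2, :1] = 1.0   # real length 1

input_lengths = (batch.abs().sum(dim=2) != 0).sum(dim=1)
print(input_lengths)   # tensor([5, 3, 1])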