Why does LSTM accept incorrect input size

import torch
import torch.nn as nn


class LSTM(nn.Module):
    def __init__(self, INPUT_SIZE, HIDDEN_SIZE):
        super().__init__()
        # expects input of shape (batch, seq_len, INPUT_SIZE) because batch_first=True
        self.lstm = nn.LSTM(INPUT_SIZE, HIDDEN_SIZE, batch_first=True)

    def forward(self, input_data):
        lstm_out, _ = self.lstm(input_data)
        return lstm_out

model = LSTM(4, 1)

input = torch.tensor([[[2, 5, 3, 2, 2, 5], [2, 4, 2, 2, 2, 5]],
                      [[5, 2, 3, 2, 2, 5], [10, 2, 7, 2, 2, 5]]], dtype=torch.float32)
input.shape
# torch.Size([2, 2, 6])

model(input)

#tensor([[[-0.1295],
#         [-0.1073]],
#
#        [[ 0.1431],
#         [ 0.2922]]], grad_fn=<TransposeBackward0>)

I believe this shouldn't work because I set the input size to 4, but the inputs I'm passing have a feature size of 6. Why doesn't this throw an error? Am I misunderstanding how the LSTM computation works?
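For reference, nn.LSTM(4, 1, batch_first=True) declares that each time step carries 4 features, so a tensor that actually matches the declared input_size would look like this (shape chosen only for illustration):

expected = torch.randn(2, 2, 4)   # (batch=2, seq_len=2, input_size=4)
model(expected).shape
# torch.Size([2, 2, 1])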

Hi @Kai_Hammond

Your code runs because PyTorch's nn.LSTM allows variable sequence lengths by default when batch_first=True.

So, by default, the LSTM processes the incoming input in a way that accommodates variable sizes:

For shorter sequences, it ignores the remaining time steps.
For longer sequences, it truncates the excess time steps.
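If you want a feature-size mismatch like the one above to fail loudly, a minimal sketch is to compare the last dimension of the input against the module's declared input_size before calling the LSTM (the class name CheckedLSTM is only for illustration, not part of your original code):

class CheckedLSTM(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)

    def forward(self, input_data):
        # nn.LSTM stores the declared feature size as .input_size;
        # with batch_first=True it expects (batch, seq_len, input_size)
        if input_data.size(-1) != self.lstm.input_size:
            raise ValueError(
                f"expected {self.lstm.input_size} input features, "
                f"got {input_data.size(-1)}"
            )
        lstm_out, _ = self.lstm(input_data)
        return lstm_out

With this guard, CheckedLSTM(4, 1)(input) on your [2, 2, 6] tensor raises a ValueError instead of silently returning an output.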
