Variable input size to LSTM

Hi Everyone,

I am new to using LSTMs. I have the following requirements:

Input to lstm: [30, 16, 2]
Output from lstm: [256, 1]

As I understand the documentation, each input item is a one-dimensional vector of a fixed length, say n. How can I feed a 3D input?

E.g. for (batch size, sequence length, input dimension), I want the “input dimension” to be [30, 16, 2].

Also, in this case, what exactly is the “Sequence length”?


nn.LSTM, in contrast to nn.LSTMCell, consumes a whole sequence at once.

In this small gist I demonstrate its usage with batches:
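The gist is not reproduced in this thread. As a stand-in (not the original gist), here is a minimal sketch of the difference described above: nn.LSTM takes the whole sequence in one call, while nn.LSTMCell is called once per time step. All sizes are made up for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes only, not taken from the thread.
seq_len, batch, input_size, hidden_size = 5, 3, 4, 8
x = torch.randn(seq_len, batch, input_size)  # default layout: (seq_len, batch, input_size)

# nn.LSTM consumes the whole sequence in a single call.
lstm = nn.LSTM(input_size, hidden_size)
out, (h_n, c_n) = lstm(x)  # out: (seq_len, batch, hidden_size)

# nn.LSTMCell handles one time step per call, so the loop is written by hand.
cell = nn.LSTMCell(input_size, hidden_size)
h = torch.zeros(batch, hidden_size)
c = torch.zeros(batch, hidden_size)
steps = []
for t in range(seq_len):
    h, c = cell(x[t], (h, c))
    steps.append(h)
out_cell = torch.stack(steps)  # (seq_len, batch, hidden_size)

print(out.shape, out_cell.shape)
```

Both paths give one hidden vector per time step and per batch element. The two modules here have independent random weights, so the values differ; only the shapes match.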

Thanks for the reply. However, I want the input itself to be 3d, not including the batch dimension.

Hope you can help me with that.

Could you flatten your input and split the output of the LSTM?
I might be wrong, but LSTM has no interaction between the values in one sequence item.
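A minimal sketch of the flattening idea, assuming the 30 samples are treated as the time steps of a single sequence (that reading is an assumption, the thread has not settled it at this point). Each [16, 2] matrix becomes a vector of 32 features.

```python
import torch
import torch.nn as nn

frames = torch.randn(30, 16, 2)      # 30 samples, each 16 keypoints with 2 coordinates
flat = frames.flatten(start_dim=1)   # (30, 32): one feature vector per sample
seq = flat.unsqueeze(1)              # (30, 1, 32): seq_len=30, batch=1, input_size=32

lstm = nn.LSTM(input_size=32, hidden_size=256)
out, (h_n, c_n) = lstm(seq)          # out: (30, 1, 256)

# Flattening only changes the shape, so it can be undone with view.
restored = flat.view(30, 16, 2)

print(seq.shape, out.shape)
```

Here `out` holds one 256-dimensional vector per sample, and `h_n` holds the one for the final step.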

Sorry, but the requirement is that the input needs to be fed in as 30 samples of [16, 2] matrices, and the context vector after encoding needs to be [256, 1]. I can’t change that.

If it’s not possible in torch, gotta do it in tensorflow.

You could use flatten or view on your input.

Sorry, I don’t get it. The sequence length (and batch size) stays the same between the input and the output.
You can modify the shape with view if it makes sense for you.

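import torch
import torch.nn as nn

lstm = nn.LSTM(32, 256, 1)  # input_size=32, hidden_size=256, num_layers=1
x = torch.randn(16, 2)      # x of dimension 16x2

# nn.LSTM expects (seq_len, batch, input_size); the initial state defaults to zeros
y, hc = lstm(x.view(1, 1, 32))
y = y.view(256, 1)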

Sure, I’ll give it a try.

256 is the sequence length, right? It’s the same as the output. Can you please explain what it signifies?


An LSTM is used for sequence input, i.e. a tensor of variable length (you can pad it so that all sequences have the same fixed length).
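As a sketch of the padding mentioned above, the utilities in torch.nn.utils.rnn can pad sequences to a common length and then pack them so that the LSTM skips the padded steps. The three sequence lengths and the feature size of 32 are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence

# Three sequences of different length; every step has 32 features.
seqs = [torch.randn(n, 32) for n in (30, 12, 7)]
lengths = torch.tensor([s.shape[0] for s in seqs])

padded = pad_sequence(seqs)  # (max_len=30, batch=3, 32), zero-padded at the end
packed = pack_padded_sequence(padded, lengths, enforce_sorted=False)

lstm = nn.LSTM(input_size=32, hidden_size=256)
packed_out, (h_n, c_n) = lstm(packed)

# Back to a padded tensor: (30, 3, 256), plus the original lengths.
out, out_lengths = pad_packed_sequence(packed_out)

print(padded.shape, out.shape, out_lengths.tolist())
```

With packed input, `h_n` holds the hidden state at each sequence's own last valid step rather than at the padded end.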

In your example I cannot see the sequence length. Is it 30, 16 or 256?

Sorry I am confused.

Okay. Let me explain. I need an LSTM to accept 30 samples of (16,2) pose keypoints (they are coordinates on an image, 30 such images) and give out a single vector of length 256.

Hence, I need a single input batch of 30 samples, each with 16 keypoints, which are coordinates.

Hope it makes things clear.

I am not sure if LSTM is the correct module for something like that directly.
But maybe the last output of the LSTM contains enough information to compute the “single vector of length 256”. You could try index_select to pick that output, plus several linear modules to map between the vector sizes.
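One way this suggestion could look, as a rough sketch rather than a tested solution: treat the 30 samples as the sequence, pick the last LSTM output with index_select, and map it to length 256 with a linear layer. The class name PoseEncoder, the hidden size of 128, and the use of a single linear layer are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class PoseEncoder(nn.Module):  # hypothetical name, not from the thread
    def __init__(self, n_keypoints=16, n_coords=2, hidden_size=128, out_size=256):
        super().__init__()
        self.lstm = nn.LSTM(n_keypoints * n_coords, hidden_size)
        self.proj = nn.Linear(hidden_size, out_size)

    def forward(self, frames):
        # frames: (seq_len, n_keypoints, n_coords), e.g. (30, 16, 2)
        seq = frames.flatten(start_dim=1).unsqueeze(1)  # (seq_len, 1, 32)
        out, _ = self.lstm(seq)                         # (seq_len, 1, hidden_size)
        last_idx = torch.tensor([out.size(0) - 1])
        last = out.index_select(0, last_idx)            # (1, 1, hidden_size)
        return self.proj(last).view(-1, 1)              # (out_size, 1)


encoder = PoseEncoder()
context = encoder(torch.randn(30, 16, 2))
print(context.shape)  # torch.Size([256, 1])
```

Setting hidden_size to 256 and reading h_n directly would avoid the linear layer; the projection is shown only because the post mentions mapping between vector sizes.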