Variable input size to LSTM

Hi Everyone,

I am new to using LSTMs. I have the following requirements:

Input to lstm: [30, 16, 2]
Output from lstm: [256, 1]

Currently, as per the documentation, each input element can be of a specific length, say n, i.e. it is one-dimensional. How can I have a 3D input?

E.g. (batch size, sequence length, input dimension): I want the “Input dimension” to be [30, 16, 2].

Also, in this case, what exactly is the “Sequence length”?

Thanks

An LSTM, in contrast to LSTMCell, consumes a whole sequence.

In this small gist I demonstrate its usage with batches:
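
(The gist itself isn’t reproduced here; below is just a minimal sketch, with made-up sizes, of how nn.LSTM consumes a whole batched sequence in one call, while nn.LSTMCell has to be stepped manually.)

import torch
import torch.nn as nn

seq_len, batch_size, input_size, hidden_size = 10, 4, 8, 16

# nn.LSTM takes the whole sequence at once: (seq_len, batch, input_size)
lstm = nn.LSTM(input_size, hidden_size, num_layers=1)
x = torch.randn(seq_len, batch_size, input_size)
output, (h_n, c_n) = lstm(x)  # output: (seq_len, batch, hidden_size)

# the same usage pattern with nn.LSTMCell needs an explicit loop over time steps
cell = nn.LSTMCell(input_size, hidden_size)
h = torch.zeros(batch_size, hidden_size)
c = torch.zeros(batch_size, hidden_size)
for t in range(seq_len):
    h, c = cell(x[t], (h, c))  # h: (batch, hidden_size)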

Thanks for the reply. However, I want the input itself to be 3D, not including the batch dimension.

Hope you can help me with that.

Could you flatten your input and split the output of the LSTM?
I might be wrong, but an LSTM has no interaction between the values within a single sequence item.

Sorry, but the requirement is that the input needs to be fed in as 30 samples of [16, 2] matrices, and the context vector after encoding needs to be [256, 1]. Can’t change that.

If it’s not possible in PyTorch, I’ll have to do it in TensorFlow.

You could use flatten or view on your input.

Sorry, I don’t get it. The sequence length (and batch size) of the input and output stay the same.
You can modify the shape with view if that makes sense for you.

import torch.nn as nn

lstm = nn.LSTM(32, 256, 1)
# x of dimension 16x2, flattened to a single 32-dim time step;
# nn.LSTM expects input of shape (seq_len, batch, input_size)
y, hc = lstm(x.view(1, 1, 32))  # hidden state defaults to zeros
y = y.view(256, 1)

Sure, I’ll give it a try.

256 is the sequence length, right? It’s the same as the output. Can you please explain what it signifies?

Thanks.

An LSTM is used for sequence input, which is a tensor of variable length (you can pad the sequences so that they all have the same fixed length).
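
To make the padding remark concrete, here is a minimal sketch (the lengths and sizes are made-up values) using torch.nn.utils.rnn.pad_sequence and pack_padded_sequence:

import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

# three sequences of different lengths, each step a 32-dim vector
seqs = [torch.randn(5, 32), torch.randn(3, 32), torch.randn(7, 32)]
lengths = torch.tensor([len(s) for s in seqs])

padded = pad_sequence(seqs)  # (max_len, batch, input_size)
packed = pack_padded_sequence(padded, lengths, enforce_sorted=False)

lstm = nn.LSTM(input_size=32, hidden_size=256, num_layers=1)
output, (h_n, c_n) = lstm(packed)  # h_n: (num_layers, batch, 256)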

In your example I cannot see the sequence length. Is it 30, 16, or 256?

Sorry, I am confused.

Okay, let me explain. I need an LSTM that accepts 30 samples of (16, 2) pose keypoints (they are coordinates on an image, 30 such images) and outputs a single vector of length 256.

Hence, I need a single input batch of 30 samples, each with 16 keypoints given as coordinates.

Hope it makes things clear.

I am not sure if an LSTM is the correct module for something like that directly.
But maybe the last output of the LSTM contains enough information to compute the “single vector of length 256”. You could try index_select and use several linear modules to map the vector sizes; see the rough sketch below.
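
A minimal sketch of one way this could look, under the assumption that each of the 30 frames is flattened into a 32-dim time step and the last time step is used as the context vector. It swaps index_select for plain indexing and skips the extra linear layer by setting the hidden size to 256 directly:

import torch
import torch.nn as nn

# 30 frames, each with 16 (x, y) keypoints -> flatten each frame to 32 values
x = torch.randn(30, 16, 2)
x = x.view(30, 1, 32)  # (seq_len=30, batch=1, input_size=32)

lstm = nn.LSTM(input_size=32, hidden_size=256, num_layers=1)
output, (h_n, c_n) = lstm(x)  # output: (30, 1, 256), h_n: (1, 1, 256)

# take the last time step as the context vector
context = output[-1]            # (1, 256); equal to h_n[0] for a one-layer LSTM
context = context.view(256, 1)  # the [256, 1] shape asked for above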