LSTM for multi-length time series

Hi!

Sorry for the question’s length - I believe it’s always better to be longer than unclear…
I have tried a lot, ang chat-GPT has been my good friend, but I still feel what I’m trying to do should have quite a simple solution, and I’m sure I’m not the first to try something of this kind.

I am trying to classify time series. I have data from a csv, where each sample is a two-row tensor of two different measurements. The rows are all 1500 items long, meaning each item of the data set (after transposing it) should be of the shape of [1500,2]. Thing is, the series are not actually that long - it is just the maximum series length, so the vast majority of the series are padded with zeros. I want to use an LSTM model to classify those series.

Thing is, I saw the simple solution is using the packed_data = nn.utils.rnn.pack_padded_sequence in the forward function. But after trying it, I found it weird that the packed_data.data object is a tensor with all the batch items concatenated, and that the LSTM layer is applied to all at once, which means the result of one affects the other (is that really the case?). After asking chat-GPT it told me that when the lengths are very different in size (which they are in my case), it is better to use dynamic unrolling. But that seemed very weird, cause you need to create your custom dataloader (Because if get_item produces different sized items the regular dataloader won’t work), or put the items in one at a time.

So, down to the question (and another one after that):

  1. How do I train an nn.LSTM model that trains only on the relevant data? My data set already has an option of returning the actual length of every item before padding. I just want it to get an item (X) and it’s length (L) and apply the layer only on the relevant part (X[:L,:]). Shouldn’t it be quite a common task?
  2. after I have done that and trained the model, how can I use the model to classify a single item, but get the hidden state (and classification after the nn.Linear layer) after every step? In the end that would be my case - I will have the data coming in step by step, and classify it in real time.

Thank you very much!