Multivariate time series forecasting and LSTM: When should I separate time series in different inputs

zaratruta · April 4, 2022, 1:25am

Suppose I have a multivariate time series with two variables that vary together in time: var1 and var2. Suppose also that I want to forecast the n-th value of var2 from a window containing the n-1 past values of var1 and var2.

I would like to use an LSTM in the first layer.

I’m not sure whether it would be better to use a single input with 2 dimensions (providing the n-1 values of both variables) or to separate the two variables into two different inputs.

Is there any rule or heuristic for guiding this decision?

Best regards.

blueeagle · April 5, 2022, 7:20am

What do you mean by "two different inputs"?

You can treat your multivariate time series as a single sequence, with both variables as its features. You then use this sequence as input for your LSTM, i.e. at each time step i the LSTM receives a vector containing the values of var1 and var2 at time i.

You can set up your LSTM to produce a single value as output, which is var2 at time i+1.

This lets you train the LSTM in a straightforward way, since your labels are just the var2 channel of your time series shifted by one time step. During inference you would use only the last output of your LSTM, which is the prediction of var2 at time step n.
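To make the shapes concrete, here is a minimal sketch of that setup. The class name `Var2Forecaster`, the hidden size, the linear output layer, and the random toy data are my own assumptions for illustration, not something from the original question; treat it as one possible arrangement rather than the definitive one.

```python
import torch
import torch.nn as nn


class Var2Forecaster(nn.Module):
    """Hypothetical model: reads [var1, var2] at each step, predicts the next var2."""

    def __init__(self, n_features=2, hidden_size=16):
        super().__init__()
        # batch_first=True means inputs are shaped (batch, time, features)
        self.lstm = nn.LSTM(input_size=n_features, hidden_size=hidden_size,
                            batch_first=True)
        # Linear layer maps each hidden state to a single value (assumed design)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.lstm(x)              # (batch, time, hidden_size)
        return self.head(out).squeeze(-1)  # (batch, time)


torch.manual_seed(0)
series = torch.randn(4, 20, 2)   # toy data: 4 sequences, 20 steps, [var1, var2]

inputs = series[:, :-1, :]       # both variables at steps 0..18
targets = series[:, 1:, 1]       # var2 shifted by one step: steps 1..19

model = Var2Forecaster()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

for _ in range(5):               # a few steps only, to show the training loop
    optimizer.zero_grad()
    preds = model(inputs)        # one prediction per input time step
    loss = loss_fn(preds, targets)
    loss.backward()
    optimizer.step()

# Inference: feed the whole window and keep only the last output,
# which is the forecast of var2 for the step right after the window.
model.eval()
with torch.no_grad():
    next_var2 = model(series)[:, -1]   # shape (4,)
```

Note that both variables go in as one input tensor with a feature dimension of 2; only the target is restricted to the var2 channel.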

And just for the sake of completeness: training LSTMs can be rather slow, because it requires an algorithm called backpropagation through time, which is computationally costly. If you always make your predictions from time windows of the same length, you could also consider using a regular CNN.
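As a rough sketch of the CNN alternative for fixed-length windows: the window length of 12, the single 1D convolution layer, and the layer sizes below are all assumptions chosen for illustration, and a 1D convolution over time is just one reading of "regular CNN".

```python
import torch
import torch.nn as nn

window = 12  # assumed fixed window length (hypothetical value)

cnn = nn.Sequential(
    # The two variables are treated as input channels; padding=1 keeps the length
    nn.Conv1d(in_channels=2, out_channels=8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Flatten(),               # (batch, 8 * window)
    nn.Linear(8 * window, 1),   # single output: the next value of var2
)

x = torch.randn(4, window, 2)   # toy batch shaped (batch, time, features)
# Conv1d expects (batch, channels, time), so swap the last two axes
y_hat = cnn(x.transpose(1, 2)).squeeze(-1)   # shape (4,)
```

Because the final linear layer is sized from `window`, this model only accepts windows of exactly that length, which is the trade-off against the LSTM.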

somayyeh_hasanzadeh · May 19, 2023, 6:54am

Hi dear @ptrblck,
I am confused. I have a CSV dataset with 47 columns, and I want to predict glucose 30 minutes ahead. Glucose is measured every 5 minutes by a CGM. Only glucose changes over time; the other columns are static. I don’t know how I should set up my model. Can you help me, please?
Thanks and regards

this is the schema of my data