Training A Forecasting Model With CNN's and LSTM's

I have been working on a prediction problem for quite some time.

My model has the input shape of [Sequence Length , Attributes] and tries to predict a single attribute at next time step.

I used very similar model before and got a good result on a very similar problem. Also, I am trying to keep my model as computationally cheap as possible to use in some part of my thesis.

Some of the attributes have critical outliers.

Aim: Predict Next Time Steps Attribute, append it to the current sequential data and keep predicting. Removing the oldest time step,and keeping sequence length at a fixed length

My Model Description: 5-6 CNN Layers 1 LSTM Layer 1 FC Layer (With ReLU Activations Except LSTM of course.)(This is just to show model idea. Layer Count Type etc. can be changed.)
I am able to choose, sequential length of the input data

What I Tried:

  1. Increased & Decreased CNN Layers
  2. Changing Filter Sizes.
  3. Added Batch Normalization Layers between CNN’s
  4. Added Max Pooling, Average Pooling & Striding On a Sequence
  5. Increased & Decreased Channel Sizes for CNN Layers
  6. Used GRU’s , LSTM’s and RNN’s for sequential Part
  7. Increased & Decreased Hidden Dimension of Sequential Layer
  8. Tried multiple Sequential Layers
  9. Tried Variety of Sequence Lengths
  10. Added Multiple FC Layer’s at the end of Sequential Layer
  11. Tried Transforming the data, “Yeo-Johnson”, Logarithmic, Cubic etc.
  12. Tried Only Sequential Layer-Based & Only CNN Based Models
  13. Different Batch Size, Much Larger Training Set , Different Number of Epochs…

Forecasting made with model:

I kind of ran out of ideas at this point. Since, I am not very experienced in DL, but a long review of multiple sources, didn’t help me at this point. What seems to be the problem here ? How can I approach this model ? What are the fundamental methods I can try here ?

Thank you for your time.

Can you provide the code and a small sample of the data? It’s hard to know whether there is an error in your code or you are using an inappropriate model.

TSAI may be worth looking into. Also, I recently had surprising success with an extremely simple, one-layer LSTM for multivariate time-series prediction (validation MAPE reached <1%, beating 10+ other architectures I tried). It may be worth following this example on your dataset.