How do I train an LSTM on experiment data that resets after an interval?

Harish_Uthravalli · August 1, 2023, 6:20pm

I am using an LSTM model for multivariate time series forecasting. My dataset is a collection of experiments, each run for a certain length of time. To train the model I create sequences from this dataset, but at the beginning of each experiment the values are reset to 0 or to some initial values.

Example of dataset

index    experiment     temperature    ph     output
  1       exp-1            30          7       0
  2       exp-1            32          7       2
  3       exp-1            33          7       3
  4       exp-1            34          7       4
  5       exp-1            35          7       3
  6       exp-2            30          7       0
  7       exp-2            32          7       5
  8       exp-2            35          7       6
  9       exp-2            36          7       4
 10       exp-2            33          7       3

Because the temporal data is reset at the beginning of each experiment, one of the sequences used for training looks like this:

index    experiment     temperature    ph     output
  1       exp-1            30          7       0
  2       exp-1            32          7       2
  3       exp-1            33          7       3
  4       exp-1            34          7       4
  5       exp-1            35          7       3
                                               
                                Label: 0 (the output at index 6, the first row of exp-2)
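
To make the problem concrete, here is a minimal sketch of how such a sample can arise. It assumes a plain sliding window over all rows with a sequence length of 5 and the next row's output as the label. The names `rows`, `WINDOW` and `naive_windows` are made up for this illustration and are not taken from my actual code.

```python
# Rows from the example table: (experiment, temperature, ph, output)
rows = [
    ("exp-1", 30, 7, 0),
    ("exp-1", 32, 7, 2),
    ("exp-1", 33, 7, 3),
    ("exp-1", 34, 7, 4),
    ("exp-1", 35, 7, 3),
    ("exp-2", 30, 7, 0),
    ("exp-2", 32, 7, 5),
    ("exp-2", 35, 7, 6),
    ("exp-2", 36, 7, 4),
    ("exp-2", 33, 7, 3),
]

WINDOW = 5  # assumed sequence length, matching the sample above


def naive_windows(rows, window):
    """Slide over all rows, ignoring which experiment each row belongs to."""
    samples = []
    for start in range(len(rows) - window):
        inputs = rows[start:start + window]
        label_row = rows[start + window]  # the row right after the window
        samples.append((inputs, label_row))
    return samples


samples = naive_windows(rows, WINDOW)
first_inputs, first_label_row = samples[0]

# The inputs all come from exp-1, but the label is taken from exp-2.
input_experiments = {r[0] for r in first_inputs}
label_experiment = first_label_row[0]
label_value = first_label_row[3]
```

Here the first window covers all of exp-1, and its label is the reset value 0 from the first row of exp-2.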

Because the sequence has no way of knowing that a new experiment has started, the model learns that the values above should predict an output of 0, which is incorrect. If I train on only one experiment, there will not be enough training data, since a single experiment does not yield many sequences.

Is there a recommended approach for training on this kind of data? How do I feed data from multiple experiments to my LSTM without the issue described above?
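
One possible direction, sketched here as an assumption rather than a confirmed fix, is to build the windows inside each experiment separately and then pool the samples from all experiments, so that no window or label crosses an experiment boundary. The helper `windows_per_experiment` and the window length of 3 are hypothetical choices for this illustration. The sketch also assumes that rows of the same experiment are contiguous and that the past output is part of the input features, as in the tables above.

```python
from itertools import groupby

# Rows from the example table: (experiment, temperature, ph, output)
rows = [
    ("exp-1", 30, 7, 0),
    ("exp-1", 32, 7, 2),
    ("exp-1", 33, 7, 3),
    ("exp-1", 34, 7, 4),
    ("exp-1", 35, 7, 3),
    ("exp-2", 30, 7, 0),
    ("exp-2", 32, 7, 5),
    ("exp-2", 35, 7, 6),
    ("exp-2", 36, 7, 4),
    ("exp-2", 33, 7, 3),
]

WINDOW = 3  # assumed; must be shorter than the shortest experiment


def windows_per_experiment(rows, window):
    """Build (inputs, label) samples without crossing experiment boundaries."""
    samples = []
    # groupby only merges adjacent rows, so each experiment must be contiguous.
    for _, group in groupby(rows, key=lambda r: r[0]):
        group = list(group)
        for start in range(len(group) - window):
            # Features per time step: (temperature, ph, output)
            inputs = [r[1:] for r in group[start:start + window]]
            # Label: the output of the next row in the same experiment
            label = group[start + window][3]
            samples.append((inputs, label))
    return samples


samples = windows_per_experiment(rows, WINDOW)
labels = [label for _, label in samples]
```

With the example data this gives two samples per experiment, four in total, and none of them uses the reset row of exp-2 as the label for exp-1 inputs. A shorter window yields more samples per experiment, which may help with the concern about too little training data.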

Training without accounting for the start of a new experiment produces the following output, which is not accurate