Hello everyone,
I am trying to build an LSTM model to forecast financial time series, and the data is split into train, validation, and test sets. The model is trained on the training set, tuned on the validation set, and 14 observations are left in the test set to evaluate it on unseen data.
I have configured my LSTM to be stateful, i.e. there is an initial hidden state that is updated within each batch and then passed on to the next batch. This makes sense for time-series data since it is sequential. I would like to carry those "warm" hidden states over to the validation data and eventually to the test data as well.
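Here is a simplified sketch of what I mean (PyTorch-style; the shapes and names below are placeholders, not my real setup):

```python
import torch
import torch.nn as nn

# Placeholder hyperparameters, just to illustrate the setup
batch_size, seq_len, n_features, hidden_size, num_layers = 32, 20, 5, 64, 1
lstm = nn.LSTM(n_features, hidden_size, num_layers, batch_first=True)

# Placeholder for my real training batches, each (batch_size, seq_len, n_features)
train_batches = [torch.randn(batch_size, seq_len, n_features) for _ in range(3)]

# Initial "cold" states, shape (num_layers, batch_size, hidden_size)
h = torch.zeros(num_layers, batch_size, hidden_size)
c = torch.zeros(num_layers, batch_size, hidden_size)

for x_batch in train_batches:
    out, (h, c) = lstm(x_batch, (h, c))   # reuse the states left by the previous batch
    h, c = h.detach(), c.detach()         # keep the values, cut the autograd graph
    # ... loss / backward / optimizer step omitted ...

# I then want to keep (h, c) and continue with them on the validation batches,
# and later on the 14 test observations.
```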
However, I run into the following problem:
The training/validation data is split into equally sized batches, but of course the last batch in each set is smaller. This means that for the last training batch I need hidden and cell states of a different shape. Those states then have to be passed on to the validation loop with the original shape again, since the first validation batches are full-sized just like the training batches.
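Concretely, the states have shape (num_layers, batch_size, hidden_size), so in the sketch above the smaller final batch breaks the shape contract:

```python
# Continuing the sketch above: h and c still have batch dimension 32
x_last = torch.randn(17, seq_len, n_features)  # last training batch is smaller, e.g. 17 samples

out, (h, c) = lstm(x_last, (h, c))
# -> RuntimeError: the LSTM expects hidden states with batch size 17 here, not 32.

# Even if I sliced (h, c) down to 17 rows just for this batch, the first validation
# batches are full-sized again and would need states with batch size 32.
```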
At the moment I simply drop the last batch from both the training and the validation data, but is there a more sustainable solution?
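In case it matters, my current workaround is roughly equivalent to this (assuming a standard DataLoader; my actual batching code differs):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder datasets standing in for my real train/validation splits
train_dataset = TensorDataset(torch.randn(500, 20, 5))
val_dataset   = TensorDataset(torch.randn(100, 20, 5))

# Current workaround: never emit the smaller final batch at all
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=False, drop_last=True)
val_loader   = DataLoader(val_dataset,   batch_size=32, shuffle=False, drop_last=True)
```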
Thank you in advance!