Hi, I am trying to model a state evolution (beta_t = F(beta_{t-1}) + noise) with an LSTM, and then feed its output together with another input (X) into a linear layer on top of the LSTM.

Is there a way to give the LSTM no input at all, only an initial estimate of the hidden state beta_0, so that it learns the beta evolution? Training would use the MSE loss between my observation y and the linear layer's output y_hat.
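
To make the question concrete, here is a minimal PyTorch sketch of the setup I have in mind. Several things in it are assumptions rather than anything I know to be the right approach: since `nn.LSTM` expects an input tensor, the sketch feeds it an all-zero dummy input; beta_0 is passed as the initial hidden state h_0 with the cell state c_0 set to zeros; and the LSTM output is concatenated with X before the linear layer. The class name `StateEvolutionModel` and all sizes are hypothetical.

```python
import torch
import torch.nn as nn


class StateEvolutionModel(nn.Module):
    """Hypothetical sketch: LSTM driven only by its initial state, plus a linear readout."""

    def __init__(self, state_dim, x_dim):
        super().__init__()
        # input_size=1 only exists to carry the zero dummy input (assumption).
        self.lstm = nn.LSTM(input_size=1, hidden_size=state_dim, batch_first=True)
        # The linear layer sees the LSTM output concatenated with X (assumption).
        self.linear = nn.Linear(state_dim + x_dim, 1)

    def forward(self, beta_0, X):
        batch, steps, _ = X.shape
        # All-zero input: the recurrence is then driven by the state and biases only.
        dummy = torch.zeros(batch, steps, 1, device=X.device, dtype=X.dtype)
        h_0 = beta_0.unsqueeze(0)    # (num_layers=1, batch, state_dim)
        c_0 = torch.zeros_like(h_0)  # cell state starts at zero (assumption)
        beta, _ = self.lstm(dummy, (h_0, c_0))  # (batch, steps, state_dim)
        y_hat = self.linear(torch.cat([beta, X], dim=-1))  # (batch, steps, 1)
        return y_hat, beta


torch.manual_seed(0)
model = StateEvolutionModel(state_dim=4, x_dim=3)
beta_0 = 0.1 * torch.randn(8, 4)  # initial state estimate, made-up values
X = torch.randn(8, 10, 3)         # the other input, made-up values
y = torch.randn(8, 10, 1)         # observations, made-up values

y_hat, beta = model(beta_0, X)
loss = nn.functional.mse_loss(y_hat, y)  # MSE between y and y_hat
loss.backward()
```

One thing I noticed while writing this: the LSTM hidden state is an output gate times a tanh, so each component of beta stays strictly between -1 and 1. If beta_0 or the true beta lies outside that range, the hidden state could not represent it directly without some rescaling.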

A bit confused here! Appreciate any help.