How to initialize the hidden state of an LSTM?

In order to use an LSTM, you need a hidden state and a cell state, which are not provided at the first step. My question is: how do you initialize the hidden state and the cell state for the first input? If they are randomly initialized, then when I feed in the second input, the same initialization should also work to predict the next output. But it does not make sense to me that feeding in a different hidden/cell state would produce the same output.

Even though that is in the context of BPTT, I think similar considerations apply, as discussed in the first item in:

Best regards
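
In case it helps, here is a minimal sketch of the usual convention: start the first step from an all-zero hidden and cell state (PyTorch's `nn.LSTM` does this when no initial state is passed), then carry the state returned by each step into the next one. The scalar one-unit cell, the weight values, and the helper names `lstm_step` and `run` are made up for illustration and are not taken from any library.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h, c, weights):
    """One step of a scalar (single-unit) LSTM cell.

    weights maps a gate name to (w_x, w_h, bias); all values are hypothetical.
    """
    def pre(gate):
        w_x, w_h, b = weights[gate]
        return w_x * x + w_h * h + b

    i = sigmoid(pre("i"))      # input gate
    f = sigmoid(pre("f"))      # forget gate
    o = sigmoid(pre("o"))      # output gate
    g = math.tanh(pre("g"))    # candidate cell value
    c_new = f * c + i * g      # new cell state
    h_new = o * math.tanh(c_new)  # new hidden state (the step's output)
    return h_new, c_new


# Hypothetical fixed weights, standing in for trained parameters.
WEIGHTS = {
    "i": (0.5, 0.1, 0.0),
    "f": (0.4, -0.2, 1.0),
    "o": (0.3, 0.2, 0.0),
    "g": (0.8, 0.5, 0.0),
}


def run(inputs, h0=0.0, c0=0.0):
    """Run the cell over a sequence, starting from (h0, c0), zeros by default."""
    h, c = h0, c0
    outputs = []
    for x in inputs:
        # The state produced by this step is what the next input sees.
        h, c = lstm_step(x, h, c, WEIGHTS)
        outputs.append(h)
    return outputs


seq = [1.0, -0.5]
from_zero = run(seq)                     # zero initial state
from_other = run(seq, h0=0.7, c0=-0.3)   # some other initial state
```

With the same weights and inputs, `from_zero` and `from_other` come out different, which matches the intuition in the question: a different starting state does change the output. Only the first step uses the initial values; the second input is processed with the state that the first step produced, not with the initialization again.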