Why does the hidden state have requires_grad=False in LSTM?

mpry · December 15, 2017, 2:28am

In the time_sequence_prediction example, the hidden state and cell state are initialized with requires_grad=False. Why is this done? Shouldn’t the outputs be differentiated with respect to these initial states?
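
To make the setup concrete, here is a minimal sketch of what I understand the situation to be. It is not the code from the example itself: the sizes, the variable names, and the toy loss are made up for illustration, and it uses plain tensors from the current PyTorch API rather than whatever wrapper the example uses. As far as I can tell, requires_grad=False on the initial states only means that no gradient is stored for those two tensors; the LSTM weights still get gradients through every time step.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
cell = nn.LSTMCell(input_size=1, hidden_size=4)
x = torch.randn(3, 5, 1)  # (batch, time, features)

# Initial states treated as constants: requires_grad defaults to False.
h = torch.zeros(3, 4)
c = torch.zeros(3, 4)
h0 = h  # keep a handle on the initial hidden state

# Unroll the cell over time; each new h and c depends on the weights.
for t in range(x.size(1)):
    h, c = cell(x[:, t, :], (h, c))

# Toy scalar loss on the final hidden state (an assumption, not the example's loss).
loss = h.pow(2).sum()
loss.backward()

# The parameters receive gradients even though the initial states do not.
weights_have_grad = all(p.grad is not None for p in cell.parameters())
print(weights_have_grad)
print(h0.grad)
```

If that reading is right, requires_grad=True would only matter if the initial states were meant to be learned or if their gradients were needed for something else.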