When to call init_hidden() for RNN

Could be wrong, but I think you re-initialize the hidden state only when there is no ordering between subsequent sequences, so there is no good reason to preserve the state across them.
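A minimal sketch of that case (my own illustration, not from the docs): if the sequences in consecutive batches are independent, you just build a fresh zero state for every batch.

```python
import torch
import torch.nn as nn

rnn = nn.LSTM(input_size=10, hidden_size=20, batch_first=True)

def init_hidden(batch_size, hidden_size=20, num_layers=1):
    # fresh zero (h, c) state for each independent batch
    return (torch.zeros(num_layers, batch_size, hidden_size),
            torch.zeros(num_layers, batch_size, hidden_size))

for batch in [torch.randn(4, 5, 10) for _ in range(3)]:  # dummy batches
    hidden = init_hidden(batch.size(0))                  # reset every batch
    output, hidden = rnn(batch, hidden)
```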

In a language model you call init_hidden() only at the start of each epoch, and instead just detach the graph (‘repackage_hidden’) at the start of each sequence so you don’t backprop between batches. In Keras I think this would be equivalent to passing stateful=True as a parameter to the LSTM (can someone please confirm this though?).
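Rough sketch of that pattern, loosely following the PyTorch word_language_model example (the helper names here are illustrative): the state is initialized once per epoch and only detached before each batch, so its values carry over but gradients do not flow across batch boundaries.

```python
import torch
import torch.nn as nn

rnn = nn.LSTM(input_size=10, hidden_size=20, batch_first=True)

def init_hidden(batch_size, hidden_size=20, num_layers=1):
    return (torch.zeros(num_layers, batch_size, hidden_size),
            torch.zeros(num_layers, batch_size, hidden_size))

def repackage_hidden(h):
    # detach the hidden state from the previous batch's graph,
    # keeping its values but cutting the backprop path
    if isinstance(h, torch.Tensor):
        return h.detach()
    return tuple(repackage_hidden(v) for v in h)

for epoch in range(2):
    hidden = init_hidden(batch_size=4)                        # once per epoch
    for batch in [torch.randn(4, 5, 10) for _ in range(3)]:   # dummy batches
        hidden = repackage_hidden(hidden)                      # detach, keep values
        output, hidden = rnn(batch, hidden)
```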