As input, I have two sequences of vectors.
Sequence 1 is passed to an RNN. The RNN is initialized with a hidden state of zeros. I collect the final hidden state and save it. Let’s call this output H1.
I do the same with Sequence 2, using the same RNN as before. Again, I initialize its hidden state to zeros and collect the final hidden state: H2.
I then use another network to compare H1 and H2 and get my predictions, so I can compute the loss and perform the backward pass and optimizer step.
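
The setup described above could be sketched roughly as follows, assuming PyTorch. The layer sizes, the `encode` helper, the concatenation-based comparison network, the dummy data, and the binary-label loss are all placeholders made up for illustration; only the overall flow (same RNN, zero initial hidden state for each sequence, one backward pass) comes from the description.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
input_size, hidden_size, batch_size = 8, 16, 4

# One RNN shared by both sequences.
rnn = nn.RNN(input_size, hidden_size, batch_first=True)

# Hypothetical comparison network: takes H1 and H2 concatenated.
comparator = nn.Sequential(
    nn.Linear(2 * hidden_size, 32),
    nn.ReLU(),
    nn.Linear(32, 1),
)

optimizer = torch.optim.Adam(
    list(rnn.parameters()) + list(comparator.parameters()), lr=1e-3
)
criterion = nn.BCEWithLogitsLoss()

# Dummy data: two batches of sequences (different lengths) and binary labels.
seq1 = torch.randn(batch_size, 5, input_size)
seq2 = torch.randn(batch_size, 7, input_size)
labels = torch.randint(0, 2, (batch_size, 1)).float()


def encode(seq):
    """Run the shared RNN from a zero hidden state and return the final hidden state."""
    h0 = torch.zeros(1, seq.size(0), hidden_size)  # (num_layers, batch, hidden)
    _, h_n = rnn(seq, h0)
    return h_n[-1]  # (batch, hidden)


optimizer.zero_grad()
h1 = encode(seq1)  # H1
h2 = encode(seq2)  # H2, same RNN weights, fresh zero hidden state
logits = comparator(torch.cat([h1, h2], dim=1))
loss = criterion(logits, labels)
loss.backward()  # single backward pass through both encode() calls
optimizer.step()
```

Here both calls to `encode` go through the same `rnn` weights before the single `loss.backward()` call.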
Is this design correct? Can I reuse the same RNN on both sequences before performing the backward pass? Is it OK even if I reset the hidden state to zeros each time?