This is a great question, one that I am struggling with too.
I see you may have figured some things out for yourself. I wish someone with definitive knowledge had answered it for you. How sure are you that your approach is now correct?
I’m afraid the LSTM model in PyTorch has been much harder for me to wrap my head around than CNNs.