Variable Length Sequences for Many-to-One RNN

Yes, that’s exactly what I’m doing here @karmus89 :slight_smile:
Although to be fair, this has led to a 5x increase in my training time compared to just taking the output at the last timestep of the tensor (regardless of each sequence's actual length), because there's a lot of back and forth between the CPU and GPU. That model would give suboptimal results in any case, though. Let me know if you happen to find a more computationally efficient way of doing this!
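For what it's worth, one way to avoid the CPU/GPU back-and-forth might be to gather the output at each sequence's last valid timestep with a single vectorised `torch.gather` call, so everything stays on the device. This is just a sketch, not your setup: the tensor shapes, the `GRU`, and the `lengths` tensor are all assumptions for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical setup: a padded batch of variable-length sequences
# plus a tensor of their true lengths.
batch, max_len, feat, hidden = 4, 7, 3, 5
x = torch.randn(batch, max_len, feat)
lengths = torch.tensor([7, 5, 3, 2])

rnn = nn.GRU(feat, hidden, batch_first=True)
out, _ = rnn(x)  # out: (batch, max_len, hidden)

# Index of the last valid timestep for each sequence, expanded so it
# can be used with gather along the time dimension.
idx = (lengths - 1).view(-1, 1, 1).expand(-1, 1, hidden)  # (batch, 1, hidden)

# One vectorised gather: picks out[i, lengths[i] - 1, :] for every i,
# with no Python loop and no per-sample device transfers.
last = out.gather(1, idx).squeeze(1)  # (batch, hidden)
```

Whether this actually beats the loop in your case would need benchmarking, since the RNN still runs over the padded timesteps (combining it with `pack_padded_sequence` could avoid that too).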