Let me break my question down into a few sub-questions:

- Should we pad the sequences to the same length to get more accurate results, i.e. more accurate reconstructed sequences and lower reconstruction error? I have heard that if the sequences have different lengths, a reconstructed sequence's length can differ from that of the corresponding input sequence. Is that so?
- I know that inside each batch, all sequences must have the same length, but I am not sure why. My understanding is that batching mainly speeds up training, and that if we apply PyTorch's batch normalization, we need a batch size greater than 1, since it normalizes across the sequences in the batch (correct me if I am wrong here). Given that, can I train a model without applying a DataLoader to my dataset? The batch size would then be 1, so I could train on sequences of different lengths.
- Also, to get a more accurate reconstruction, should I make sure that all of my time-series sequences have a fixed (or at least similar) starting point, e.g. aligned at zero-crossings for a sinusoid?
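For context on the first two points, here is a minimal sketch of what I mean by padding sequences so they can be stacked into one batch tensor. It uses `torch.nn.utils.rnn.pad_sequence`; the sequence lengths and feature dimension are made up for illustration:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Three variable-length sequences with feature dimension 1 (lengths 5, 3, 7 are arbitrary)
seqs = [torch.randn(5, 1), torch.randn(3, 1), torch.randn(7, 1)]

# pad_sequence zero-pads every sequence up to the longest one so that
# they can be stacked into a single (batch, time, features) tensor
batch = pad_sequence(seqs, batch_first=True)

print(batch.shape)  # torch.Size([3, 7, 1])
```

This is why sequences inside a batch must share one length: a batch is a single rectangular tensor, so shorter sequences have to be padded (or longer ones truncated) to fit.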
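And here is a sketch of the batch-size-1 alternative I am asking about: with one sequence per step there is no equal-length requirement, so no padding is needed. The model and sizes below are made up for illustration (note that in training mode, layers like `nn.BatchNorm1d` would complain with a single sample, which is part of my question):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy recurrent model; input_size/hidden_size of 1 are arbitrary choices
model = nn.LSTM(input_size=1, hidden_size=1, batch_first=True)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

# Two sequences of different lengths (5 and 9), each its own "batch" of size 1
seqs = [torch.randn(1, 5, 1), torch.randn(1, 9, 1)]

for x in seqs:
    out, _ = model(x)        # output length always matches the input length
    loss = loss_fn(out, x)   # reconstruction-style loss against the input itself
    opt.zero_grad()
    loss.backward()
    opt.step()

print(out.shape)  # torch.Size([1, 9, 1]) — same length as the last input
```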