Difference between inputting all data in one batch and feeding it through a for-loop?

So, I’ve trained an LSTM autoencoder to reconstruct time-series data. It’s trained to encode/decode 7-day sub-sequences of the time series.

When I test it on a single sub-sequence, it reconstructs the data very well. I can also reconstruct the entire time series by running each sub-sequence through the model one at a time in a for-loop. The reconstruction gets really close, and I'm very happy with it.
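Roughly, the loop version looks like this (sketched with NumPy and a dummy identity model standing in for my trained autoencoder; the names and shapes are just for illustration):

```python
import numpy as np

# Hypothetical stand-in for the trained autoencoder's forward pass;
# assume it maps (batch_size, seq_len, n_features) -> same shape.
def model(x):
    return x  # placeholder "perfect" reconstruction for illustration

seq_len, n_features = 7, 1
timeseries = np.arange(28, dtype=np.float64).reshape(-1, n_features)  # (28, 1)

# Reconstruct one 7-day window at a time, each call a batch of size 1.
recon = np.concatenate([
    model(timeseries[i:i + seq_len][None])[0]  # add batch dim, then drop it
    for i in range(0, len(timeseries), seq_len)
])
```

Concatenating the per-window outputs gives me back an array the same shape as the original series.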

What I thought I could also do was run all the sub-sequences through the model in parallel by building an input of shape (batch_size, seq_len, n_features), where seq_len is the length of my 7-day sub-sequences, so batch_size = len(timeseries) // seq_len.
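In other words, something like this (again with NumPy and a dummy identity model in place of my trained autoencoder, purely to show the reshaping I mean):

```python
import numpy as np

# Hypothetical stand-in for the trained autoencoder; assume its forward
# pass maps (batch_size, seq_len, n_features) -> same shape.
def model(x):
    return x  # placeholder "perfect" reconstruction for illustration

seq_len, n_features = 7, 1
timeseries = np.arange(28, dtype=np.float64).reshape(-1, n_features)  # (28, 1)

# Fold the series into non-overlapping windows: one window per batch entry.
batch_size = len(timeseries) // seq_len
batch = timeseries[:batch_size * seq_len].reshape(batch_size, seq_len, n_features)

# One forward pass over all windows, then flatten back to the original layout.
recon = model(batch).reshape(-1, n_features)
```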

However, the model does not reconstruct the time series correctly with this method. Why not? Isn't this exactly the input shape the model was trained on, with the gradients computed over whole batches?