Acceleration for very long time series training


In some time series prediction tasks, e.g. stock prices or weather forecasting, the sequences are generally very long. My current method uses a sliding window to slice a long sequence into many shorter, fixed-length sequences. It works, but it is very slow. I noticed that since neighboring windows differ by only one or a few steps, most of the computation is redundant.
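To make the redundancy concrete, here is a small sketch of the slicing I currently do (pure Python; the window size and stride are just illustrative):

```python
def sliding_windows(seq, window, stride=1):
    """Slice a long sequence into overlapping fixed-length windows."""
    return [seq[i:i + window] for i in range(0, len(seq) - window + 1, stride)]

seq = list(range(1000))                    # one long series of 1000 steps
wins = sliding_windows(seq, window=50)     # 951 windows of length 50

# Consecutive windows share 49 of their 50 steps, so the network
# re-processes almost the same input again and again:
# 951 windows * 50 steps = 47,550 steps fed to the model,
# versus only 1,000 steps in the raw sequence.
assert wins[1][:-1] == wins[0][1:]
```

With stride 1 and window length W, roughly a factor of W more work is done than a single pass over the series would need.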

A natural alternative is to feed the long sequence to the network directly: iterate over it one step at a time for several steps as a chunk, compute the loss, back-propagate the gradients and update the weights, then move on to the next chunk, and so on. I think this could save over 99% of the computation. Is it possible? How can I implement it?
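A sketch of what I have in mind, which I believe is usually called truncated backpropagation through time (TBPTT): carry the hidden state across chunks so context is preserved, but detach it between chunks so gradients do not flow through the whole history. This assumes PyTorch; the model, sizes, and names are all illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

seq_len, chunk_len, batch, feat, hidden = 1000, 50, 4, 8, 16
x = torch.randn(seq_len, batch, feat)   # one long input sequence
y = torch.randn(seq_len, batch, 1)      # per-step targets

rnn = nn.GRU(feat, hidden)              # (seq, batch, feat) layout
head = nn.Linear(hidden, 1)
opt = torch.optim.Adam(list(rnn.parameters()) + list(head.parameters()), lr=1e-3)
loss_fn = nn.MSELoss()

h = None                                # hidden state carried across chunks
for start in range(0, seq_len, chunk_len):
    xb = x[start:start + chunk_len]
    yb = y[start:start + chunk_len]

    out, h = rnn(xb, h)                 # context flows in via h
    loss = loss_fn(head(out), yb)

    opt.zero_grad()
    loss.backward()                     # gradients only span this chunk
    opt.step()

    h = h.detach()                      # truncate the graph here
```

Each step of the series is processed exactly once per epoch, instead of once per overlapping window, which is where the large saving would come from.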