Hi,

In some time series prediction tasks, e.g. stock prices or weather forecasting, the sequences are generally very long. My current method uses a sliding window to slice a long sequence into many shorter, fixed-length sequences. It works, but it is very slow. I noticed that because neighboring windows differ only slightly, most of the computation is wasted.

A natural alternative is to feed the long sequence directly to the network: iterate over it one step at a time, treat several steps as a batch, then calculate the loss, back-propagate the gradients and update the weights, then move on to the next batch... I think this could save over 99% of the computation. Is it possible? How would I implement this?
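
To make the question concrete, here is a minimal sketch of the scheme described above, which is usually called truncated backpropagation through time. It is only an illustration under stated assumptions: a toy scalar RNN written in plain Python with hand-derived gradients. The function name `tbptt_train`, the model, the sine-wave data and all hyperparameters are hypothetical and not taken from any library.

```python
import math

def tbptt_train(xs, ys, chunk_len=20, lr=0.05, epochs=20):
    """Toy scalar RNN: h_t = tanh(w*h_{t-1} + u*x_t), y_hat_t = v*h_t.
    Trained on consecutive, non-overlapping chunks of one long sequence."""
    w, u, v = 0.5, 0.5, 0.5          # arbitrary starting weights (assumption)
    steps = 0                        # number of RNN time steps actually computed
    losses = []
    for _ in range(epochs):
        h = 0.0                      # hidden state, carried from chunk to chunk
        total = 0.0
        for start in range(0, len(xs), chunk_len):
            x_chunk = xs[start:start + chunk_len]
            y_chunk = ys[start:start + chunk_len]
            n = len(x_chunk)
            # Forward pass over this chunk only, starting from the carried state.
            hs = [h]
            for x in x_chunk:
                hs.append(math.tanh(w * hs[-1] + u * x))
                steps += 1
            errs = [v * hs[t + 1] - y_chunk[t] for t in range(n)]
            total += sum(e * e for e in errs)
            # Backward pass: gradients of the chunk's mean squared error.
            gw = gu = gv = 0.0
            dh_next = 0.0            # gradient arriving at h_t from step t+1
            for t in reversed(range(n)):
                h_t = hs[t + 1]
                dy = 2.0 * errs[t] / n
                gv += dy * h_t
                dh = dy * v + dh_next
                da = dh * (1.0 - h_t * h_t)   # derivative of tanh
                gw += da * hs[t]
                gu += da * x_chunk[t]
                dh_next = da * w
            # Truncation: dh_next is dropped here, so no gradient flows
            # back into earlier chunks.
            w -= lr * gw
            u -= lr * gu
            v -= lr * gv
            h = hs[-1]               # carry the state's value, not its gradient
        losses.append(total / len(xs))
    return (w, u, v), losses, steps

# Usage: predict the next value of a sine wave from the current one.
series = [math.sin(0.1 * t) for t in range(1001)]
inputs, targets = series[:-1], series[1:]
weights, losses, steps = tbptt_train(inputs, targets)
```

Each epoch here processes every time step exactly once, so the cost is `len(inputs)` steps per pass. Sliding windows of length W with stride 1 process roughly W times as many steps, so a saving above 99% would correspond to windows of about 100 steps or more. The trade-off is that gradients never reach further back than one chunk, even though the hidden state does.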

Thanks!

Ben