Handling large gaps in TimeSeriesDataSet

Eyzzle · April 24, 2025, 9:21am

Hello everyone,

I am currently working on time series forecasting. I have several datasets that I would like to combine into a single dataset to improve my model training and, in turn, my predictions. The issue is that my datasets are not continuous in time:

My datasets consist of daily values over a period of several years. When joining datasets, the values from one dataset could end in 2014 and the values from another could start again in 2018. Obviously I cannot fill in a gap of that size to make the combined dataset continuous.
Additionally, some of the datasets have long stretches of missing values (several weeks at a time), which again cannot be filled automatically.

I would like to take advantage of the TimeSeriesDataSet class, but I need it to allow gaps in my data. The main advantages of this class are that the handling of multiple covariates is already implemented, as well as the from_dataset() method for building a neural network architecture.

From what I’ve seen in this GitHub discussion, I could set the weight to 0 for the missing values, but I am not sure this is ideal.
Ideally, I would like to override the method that selects the time windows: my prediction for the next day is based on the last 7 days, so I would like to select only windows with 7 + 1 = 8 days of continuous data. Is there any “easy” way to do this?
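
To make the window idea concrete, here is a rough sketch of the selection I have in mind. It is plain Python and does not use any PyTorch Forecasting API; the function name continuous_window_starts and the toy dates are just made up for illustration.

```python
from datetime import date, timedelta


def continuous_window_starts(days, window=8):
    """Return every index i such that days[i:i + window] are consecutive calendar days.

    `days` must be sorted and contain no duplicates.
    """
    starts = []
    run_length = 0  # number of consecutive days ending at the current index
    for i in range(len(days)):
        if i > 0 and days[i] - days[i - 1] == timedelta(days=1):
            run_length += 1
        else:
            run_length = 1  # first day, or a gap: start a new run
        if run_length >= window:
            starts.append(i - window + 1)
    return starts


# Toy data: 10 days at the end of 2014, a gap of several years, then 9 days in 2018.
days = [date(2014, 12, 22) + timedelta(days=k) for k in range(10)]
days += [date(2018, 1, 1) + timedelta(days=k) for k in range(9)]

starts = continuous_window_starts(days, window=8)
print(starts)  # [0, 1, 2, 10, 11]
```

Each returned index marks a window of 7 input days plus 1 target day, and no window crosses the 2014 to 2018 gap. What I do not know is how to make TimeSeriesDataSet restrict itself to windows like these.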

Apart from these two ideas, is there something more appropriate for my issue? Sorry, I am new to PyTorch Forecasting and could not find a solution when searching for my problem.

Thank you!