What to do if your model ignores the input and just learns to predict the previous value?

Hi everyone,

I’m working on this time-series regression problem and I’ve already gone through the following stages:

  • prepared several datasets, starting with the raw series alone, then adding moving averages, then sentiment data, etc.;

  • trained benchmarks: persistence models, linear regressions, ARIMA, …

  • tried a variety of deep learning architectures (MLPs, ResNets, WaveNets, LSTMs, etc.)

So, what happens is that no matter (i) how the dataset is built and (ii) how complex or fancy the architecture is, the model always ends up ignoring the input and simply predicting, as its output at timestep t, the value observed at timestep t-1. This is what the literature calls a persistence model (and it's one of my benchmarks).
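For concreteness, this is roughly how I check for the collapse. The random-walk series and the "collapsed" predictions below are synthetic stand-ins just to show the check itself; in my case y_pred comes from the trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for my series: a random walk.
x = np.cumsum(rng.normal(size=1000))

# Targets and the naive persistence benchmark ŷ(t) = x(t-1).
y_true = x[1:]
y_naive = x[:-1]

# Stand-in for the model's output: persistence plus tiny noise,
# which is exactly the degenerate behaviour I keep observing.
y_pred = y_naive + rng.normal(scale=0.01, size=y_naive.shape)

def mae(a, b):
    return np.mean(np.abs(a - b))

print("MAE model          :", mae(y_pred, y_true))
print("MAE persistence    :", mae(y_naive, y_true))
print("MAE model vs naive :", mae(y_pred, y_naive))
# If the last number is ~0 while the first two are nearly equal,
# the model has effectively collapsed onto the persistence benchmark.
```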

TL;DR:

Time-series framing problem: DL models (across several architectures) end up completely ignoring the input and learn the trivial persistence mapping ŷ(t) = x(t-1).

Q: How can I address this issue? Is there a way to penalise this behaviour during training?
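To make the question concrete, this PyTorch-style sketch is the kind of penalty I have in mind (the function name, the weight lambda_pen and the tensor names are my own placeholders, not from any existing codebase). I haven't tried it yet; I'm mainly asking whether this direction makes sense or if there is a more standard way to handle it:

```python
import torch

def loss_with_persistence_penalty(y_pred, y_true, x_last, lambda_pen=0.1):
    """MSE plus a term that grows when the prediction sits on top of x(t-1).

    y_pred : model output for timestep t
    y_true : target at timestep t
    x_last : observed value at timestep t-1 (the persistence forecast)
    """
    mse = torch.mean((y_pred - y_true) ** 2)
    # Penalise closeness to the naive forecast; the exp keeps the
    # penalty bounded in (0, 1] and largest when y_pred == x_last.
    closeness = torch.exp(-torch.abs(y_pred - x_last))
    return mse + lambda_pen * torch.mean(closeness)
```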