Hi everyone,

I’m trying to implement a transformer model for time series forecasting, e.g., using the previous ten time steps x_1, x_2, …, x_10 to predict the next five time steps x_11, x_12, …, x_15. When predicting x_11, x_1 to x_10 should be fed into the encoder and x_10 should be fed into the decoder, and the decoder output is \hat{x_11}, i.e., the predicted value of x_11. When predicting the next time step x_12, should I append \hat{x_11} to the encoder input? That is, should I use x_1, x_2, …, \hat{x_11} as the encoder input and \hat{x_11} as the decoder input to predict x_12?

Hi @fatcat, I think it is fine to use the output of the decoder (\hat{x_11}) as a new input for your encoder, since you don’t have the ground truth for x_11. In this case, \hat{x_11} will be your best approximation.
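
In case it helps, here is a minimal sketch of that rolling (autoregressive) loop in plain Python. It is only an illustration under some assumptions: `rolling_forecast` and `predict_next` are hypothetical names, and `predict_next` stands in for a single forward pass of your trained transformer that takes the encoder input and the decoder input and returns one predicted value. The toy predictor at the bottom is not a real model; it just makes the example runnable.

```python
from typing import Callable, List, Sequence

# Hypothetical one-step predictor: (encoder_input, decoder_input) -> next value.
# In practice this would wrap a forward pass of your trained transformer.
OneStepPredictor = Callable[[Sequence[float], Sequence[float]], float]


def rolling_forecast(
    history: Sequence[float],
    predict_next: OneStepPredictor,
    horizon: int = 5,
) -> List[float]:
    """Forecast `horizon` steps by feeding each prediction back as input."""
    context = list(history)  # starts as x_1 ... x_10
    predictions: List[float] = []
    for _ in range(horizon):
        encoder_input = context        # everything known or predicted so far
        decoder_input = context[-1:]   # the most recent value (x_10, then \hat{x_11}, ...)
        next_value = predict_next(encoder_input, decoder_input)
        predictions.append(next_value)
        # No ground truth exists for this step, so the prediction is appended
        # and used as the best available approximation for the next step.
        context = context + [next_value]
    return predictions


# Toy stand-in for a trained model (assumption): predicts "last value + 1".
def toy_predictor(encoder_input: Sequence[float], decoder_input: Sequence[float]) -> float:
    return decoder_input[-1] + 1.0


observed = [float(i) for i in range(1, 11)]  # x_1 ... x_10
print(rolling_forecast(observed, toy_predictor, horizon=5))
# prints [11.0, 12.0, 13.0, 14.0, 15.0]
```

Note that the encoder input grows by one value per step here, as in your question. If your model expects a fixed input length, one option is to drop the oldest value each time you append a prediction. Also keep in mind that prediction errors can accumulate over the horizon, since each step builds on earlier predictions.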
