LSTM just predicting its inputs: possible solution?

I have an LSTM for time series prediction, and its problem is that it just "predicts" the input values, i.e. it essentially copies the last input instead of forecasting.

Would it help to calculate the loss not only against the labels (the label is the input sequence shifted one step forward) but also against the inputs? The model would then try not only to minimize the difference between output and labels, but also to maximize the difference between output and input.

Something like loss = loss_function(y_pred, labels) + 1 / loss_function(y_pred, inputs)
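Here is roughly what I tried, as a minimal PyTorch sketch (assuming MSE as the loss function; `y_pred`, `labels`, `inputs`, and the `eps` constant are placeholders for illustration, not my exact code):

```python
import torch
import torch.nn as nn

mse = nn.MSELoss()

def combined_loss(y_pred, labels, inputs, eps=1e-8):
    # Standard term: pull predictions toward the shifted targets.
    fit = mse(y_pred, labels)
    # Repulsion term: penalize predictions that just copy the inputs.
    # eps avoids division by zero when y_pred exactly equals inputs.
    repel = 1.0 / (mse(y_pred, inputs) + eps)
    return fit + repel
```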

For some reason this configuration doesn't work, but is it a good/workable idea in principle?