[Time Series] The LSTM prediction result is approximately a straight line

Hi,

Recently, I have been working on a time series prediction project using the RNN and LSTM modules of PyTorch.

I have a problem. When I use the RNN, the prediction results are acceptable, but when I use the LSTM, I get very poor results. (PS: I use the same data structure and the same parameters for both the RNN and the LSTM.)

I have tried changing the amount of data per training step, the number of hidden neurons, and the number of layers in the LSTM, but the predicted results still cannot fit the real data well.

Here are the prediction results I got:
[image: RNN prediction results]
The above graph shows the RNN prediction results. The green line represents the real data, and the red line represents the prediction result. I am quite satisfied with this result.
Next:
[image: LSTM prediction results]
The above graph shows the LSTM prediction results. The green line represents the real data, and the blue line represents the prediction result. As you can see, the predicted result is almost a straight line.
When I zoom in on the prediction results, the trend looks like this:

[image: zoomed-in LSTM prediction results]
I don’t know why this is happening. Is there a solution that can help me solve this problem?
——
Here are the main code snippets for the LSTM:

import torch
import torch.nn as nn
import torch.utils.data as data

# every time step, use the data of three time points to predict the data of the next time point
# use_data is a tensor of shape  torch.Size([1790, 1, 3])
# back_data is a tensor of shape torch.Size([1790, 1, 1])

BATCH_SIZE=1
LR = 0.0003  
EPOCHS = 10  

train_set=data.TensorDataset(use_data,back_data)  
loader = data.DataLoader(dataset=train_set, batch_size=BATCH_SIZE, shuffle=False, num_workers=0) 
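
For reference, here is a minimal sketch of how tensors with those shapes could be built from a 1-D series with a window of three time steps. The series variable and the make_windows helper are illustrative names, not from the original post:

def make_windows(series, window=3):
    # series: 1-D tensor of length N
    # returns inputs of shape [N - window, 1, window] and targets of shape [N - window, 1, 1]
    xs, ys = [], []
    for t in range(len(series) - window):
        xs.append(series[t:t + window])
        ys.append(series[t + window:t + window + 1])
    x = torch.stack(xs).unsqueeze(1)   # [N - window, 1, window]
    y = torch.stack(ys).unsqueeze(1)   # [N - window, 1, 1]
    return x, y

# example with a placeholder series of 1793 points, which gives 1790 windows
series = torch.randn(1793)
use_data, back_data = make_windows(series, window=3)
print(use_data.shape, back_data.shape)  # torch.Size([1790, 1, 3]) torch.Size([1790, 1, 1])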

class lstm(nn.Module):
    def __init__(self):
        super(lstm, self).__init__()
        self.lstm = nn.LSTM(3, 3)      # input_size=3, hidden_size=3
        self.linear = nn.Linear(3, 1)  # map the hidden state to a single output value

    def forward(self, x, h):
        y1, h = self.lstm(x, h)
        y3 = self.linear(y1)
        return y3, h

NET = lstm()
optimizer = torch.optim.Adam(NET.parameters(), lr=LR)
loss_func = nn.MSELoss()

h_state = torch.randn(1, 1, 3)  # (num_layers, batch, hidden_size)
c_state = torch.randn(1, 1, 3)
hx = (h_state, c_state)

lstm(
  (lstm): LSTM(3, 3)
  (linear): Linear(in_features=3, out_features=1, bias=True)
)



total_loss = []
wc_loss_plt = []

NET.train()
for step in range(EPOCHS):
    wc_loss = []
    pre = []
    for i, (batch_x, batch_y) in enumerate(loader):
        out, hx = NET(batch_x, hx)

        # detach the hidden state so gradients do not flow across batches
        hx = (hx[0].detach(), hx[1].detach())

        loss = loss_func(out, batch_y)
        pre.append(out.detach())   # prediction result
        wc_loss.append(loss.item())

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    total_loss.append(sum(wc_loss))

You might need to continue the training. Since the predictions already seem to take the shape of the target, the scaling might still need to be adjusted.
How are the training and validation losses behaving? Are both still decreasing, or are you seeing a plateau?
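
As a rough sketch of how you could track this: assuming the model, loader, loss_func, and optimizer above, plus a held-out validation DataLoader called val_loader (not shown in the original post), something like this records both losses per epoch so a plateau becomes visible. Note that this sketch re-initializes the hidden state at the start of each pass, which differs slightly from the original loop:

train_curve, val_curve = [], []

for epoch in range(EPOCHS):
    NET.train()
    train_losses = []
    hx = (torch.zeros(1, 1, 3), torch.zeros(1, 1, 3))
    for batch_x, batch_y in loader:
        out, hx = NET(batch_x, hx)
        hx = (hx[0].detach(), hx[1].detach())
        loss = loss_func(out, batch_y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        train_losses.append(loss.item())

    NET.eval()
    val_losses = []
    with torch.no_grad():
        vhx = (torch.zeros(1, 1, 3), torch.zeros(1, 1, 3))
        for batch_x, batch_y in val_loader:  # val_loader is an assumed validation DataLoader
            out, vhx = NET(batch_x, vhx)
            val_losses.append(loss_func(out, batch_y).item())

    train_curve.append(sum(train_losses) / len(train_losses))
    val_curve.append(sum(val_losses) / len(val_losses))
    print(f"epoch {epoch}: train {train_curve[-1]:.6f}  val {val_curve[-1]:.6f}")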


I almost forgot about this question. I still don’t know what caused this problem.
However, I normalized the original data and got a good result.

I tried using the LSTM + Linear structure to train on and predict the function y = x.
Only by normalizing x could I get good prediction results.

So I can’t understand why normalization is useful. I thought it was just a method to shrink the data.
Maybe I didn’t find the crux of the problem at all.
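
A minimal sketch of this kind of normalization (assuming min-max scaling applied to the tensors above; the exact transform used in the original experiment isn’t shown, and x_min/x_max are my own names):

# min-max scale the inputs to [0, 1]
x_min, x_max = use_data.min(), use_data.max()
use_data_n = (use_data - x_min) / (x_max - x_min)

# optionally scale the targets with the same statistics, so predictions
# can be mapped back to the original range afterwards
back_data_n = (back_data - x_min) / (x_max - x_min)

train_set = data.TensorDataset(use_data_n, back_data_n)

# after training, undo the scaling on a prediction:
# pred_original = pred_normalized * (x_max - x_min) + x_min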

Normalization helps model training in general. E.g., one theoretical point of view is that whitening the data creates loss surfaces with “round” valleys, which accelerates convergence. I’m pretty sure Bishop explains it nicely in Pattern Recognition and Machine Learning.

Moving to a new issue.

I believe you understand that the prediction shown on the plot doesn’t actually predict anything in reality?
By the way, I have successfully trained a bunch of LSTM models with different kinds of inputs (different features, normalized and not, different sequence lengths, labels, etc.), and in every scenario it ends up with a prediction curve that looks fancy, but is always one step late relative to the real curve.
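
One way to check whether a model has merely learned to repeat the previous value (a common cause of the “one step late” curve) is to compare its error against a naive persistence baseline. A hedged sketch, assuming the use_data and back_data tensors above:

# naive persistence baseline: predict the last observed value of each input window
persistence_pred = use_data[:, :, -1:]                 # shape [N, 1, 1]
baseline_mse = torch.mean((persistence_pred - back_data) ** 2)

# compare this against the model's MSE on the same data; if the two are nearly
# identical, the model is effectively just shifting the series by one step
print("persistence baseline MSE:", baseline_mse.item())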

Why does normalizing only x give a good result? What happens if you normalize both x and y?