Predicting future values with LSTM forecasting gives the same result for both variables

I’m working on an LSTM model for time-series forecasting.
My dataset has two variables containing the sales values of two companies, and ten rows. I want to forecast the 11th row. For both variables, I end up getting the same value as the output. Any ideas on what I have missed and how to improve the results?

The input tensor, label tensor, and output tensor are as follows.

Inputs:

```
tensor([[ 1431,  1424,  1483,  ...,  1432,  1334,  1413],
        [11574, 11613, 11671,  ..., 11597, 11628, 11644]])
```

Label (11th row values): tensor([ 1472, 11602])

The output I'm getting is: tensor([5893.2, 5893.2])

My LSTM model class is as follows.

```
import lightning as L
import torch.nn as nn
from torch.optim import Adam


class LSTMnn(L.LightningModule):
  def __init__(self):
    super().__init__()

    self.lstm = nn.LSTM(input_size=1, hidden_size=256, num_layers=16)
    self.lstm1 = nn.LSTM(input_size=256, hidden_size=128, num_layers=16)
    self.lstm2 = nn.LSTM(input_size=128, hidden_size=64, num_layers=10)
    self.lstm3 = nn.LSTM(input_size=64, hidden_size=32, num_layers=10)
    self.lstm4 = nn.LSTM(input_size=32, hidden_size=8, num_layers=10)
    self.linear = nn.Linear(8, 1)
    self.loss_fn = nn.MSELoss()

  def forward(self, input):
    # reshape the series and pass it through the stacked LSTMs
    lstm_out, _ = self.lstm(input.view(len(input), 1, -1))
    lstm_out, _ = self.lstm1(lstm_out)
    lstm_out, _ = self.lstm2(lstm_out)
    lstm_out, _ = self.lstm3(lstm_out)
    lstm_out, _ = self.lstm4(lstm_out)
    # map the last time step's output to a single predicted value
    prediction = self.linear(lstm_out[-1].view(1, -1))
    return prediction

  def configure_optimizers(self):
    return Adam(self.parameters(), lr=0.1)

  def training_step(self, batch, batch_idx):
    input_i, label_i = batch
    output_i = self.forward(input_i[0])
    loss = self.loss_fn(output_i, label_i)
    self.log('train_loss', loss, on_step=True, on_epoch=True, prog_bar=True)
    return loss
```

Can you clarify whether your model trains properly, i.e., whether the training loss goes down significantly? I’m also not sure what you mean by “getting the same tensor value”. Is it always the same value, independent of the input?

Since you’re using view() a couple of times, you may want to check that it does the correct thing.

I’m also not sure about the last view() in general. lstm_out will have a shape of (seq_len, batch_size, 8), which means that lstm_out[-1] will have a shape of (batch_size, 8). I’m not sure why you do a view(1, -1) on that, as it should return a shape of (1, batch_size*8). Am I misreading something?
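To illustrate, here is a quick shape check, a minimal sketch using hidden_size=8 and batch_size=2 to mirror your setup, showing how that view(1, -1) silently merges the two samples into one row:

```
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=1, hidden_size=8)  # batch_first=False by default

x = torch.randn(10, 2, 1)  # (seq_len=10, batch_size=2, input_size=1)
out, _ = lstm(x)

print(out.shape)                  # torch.Size([10, 2, 8]) -> (seq_len, batch, hidden)
print(out[-1].shape)              # torch.Size([2, 8])     -> last time step, per sample
print(out[-1].view(1, -1).shape)  # torch.Size([1, 16])    -> both samples flattened together
```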

Lastly, you have 5 nn.LSTM modules, each itself a stack of 10 or 16 layers. That seems like overkill and might take very long to train properly.
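If it helps, a single nn.LSTM module with a small num_layers already gives you a stacked LSTM; a minimal sketch (the sizes here are arbitrary):

```
import torch.nn as nn

# one nn.LSTM module = num_layers stacked LSTM layers
lstm = nn.LSTM(input_size=1, hidden_size=32, num_layers=2)
linear = nn.Linear(32, 1)
```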


Thank you for the reply.
The loss is decreasing significantly. I made some changes to the model: I reduced the number of LSTM layers, and the model now has 2 outputs corresponding to the 2 variables. The prediction is now different from before, but the two predicted values are still almost identical to each other.

New prediction:

```
tensor([1384.5271, 1384.7784])
```

Actual value should be:

```
tensor([ 1614., 11757.])
```

May I please know if I have done anything wrong with the model?

```
class LSTMnn(L.LightningModule):
  def __init__(self):
    super().__init__()

    self.lstm = nn.LSTM(input_size=1, hidden_size=64, num_layers=4)
    self.lstm1 = nn.LSTM(input_size=64, hidden_size=32, num_layers=2)
    self.lstm2 = nn.LSTM(input_size=32, hidden_size=8, num_layers=2)
    self.linear = nn.Linear(8, 2)
    self.loss_fn = nn.MSELoss()

  def forward(self, input):
    lstm_out, _ = self.lstm(input.view(len(input), 1))
    lstm_out, _ = self.lstm1(lstm_out)
    lstm_out, _ = self.lstm2(lstm_out)
    prediction = self.linear(lstm_out[-1])
    return prediction

  def configure_optimizers(self):
    return Adam(self.parameters(), lr=0.1)

  def training_step(self, batch, batch_idx):
    input_i, label_i = batch
    output_i = self.forward(input_i[0])
    loss = self.loss_fn(output_i, label_i)
    self.log('train_loss', loss, on_step=True, on_epoch=True, prog_bar=True)
    return loss
```

What is the shape of input at the beginning of forward()? According to your initial description, each sequence consists of 2d data points, but your first LSTM has input_size=1 where I would have expected input_size=2.

@vdw The shape of input at the beginning of forward() is (sequence_length, batch_size, input_size), which is (10, 2, 1):

sequence_length is 10 (the length of the series).
batch_size is 2 (the number of sequences in the input tensor).
input_size is 1 (since I'm taking the whole series as one input).

I have set the first LSTM's input_size to 1 since I'm considering the whole sequence as one series.
Isn't that correct? Could you tell me how the layer inputs and outputs should look in this case?
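To double-check my understanding, here is a quick sketch with dummy data of that shape (using the same layer sizes as my first LSTM above):

```
import torch
import torch.nn as nn

x = torch.randn(10, 2, 1)  # (sequence_length=10, batch_size=2, input_size=1)
lstm = nn.LSTM(input_size=1, hidden_size=64, num_layers=4)

out, _ = lstm(x)
print(out.shape)  # torch.Size([10, 2, 64])
```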

Does that mean that batch_size=2 reflects that you have data from 2 companies?

I would assume your inputs to look like (seq_len=10, batch_size=32, input_size=2), with 32 just being an example batch size. If you do not treat both companies as 1 data sample, why do you want your model to predict 2 values?
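To make that concrete, here is a minimal sketch of what I would have expected, treating both companies as one multivariate sample (the hidden size is just a placeholder):

```
import torch
import torch.nn as nn

# one sample whose 2 features are the two companies' sales at each time step
x = torch.randn(10, 1, 2)  # (seq_len=10, batch_size=1, input_size=2)

lstm = nn.LSTM(input_size=2, hidden_size=32, num_layers=2)
linear = nn.Linear(32, 2)  # predict the next value for both companies at once

out, _ = lstm(x)
prediction = linear(out[-1])
print(prediction.shape)  # torch.Size([1, 2])
```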

Still, assuming that your shape is (seq_len=10, batch_size=2, input_size=1), why do you need to do input.view(len(input), 1)? Why doesn’t this throw an error? What’s the output when you add the line

```
print(input.view(len(input), 1).shape)
```

at the beginning of your forward() method?

Yes, I used batch_size=2 to indicate data from both companies.

When I add that line, I get an error:

```
RuntimeError: shape '[2, 1]' is invalid for input of size 20
```
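Reproducing it in isolation, the error seems to come from the tensor having 20 elements while view(len(input), 1) asks for a (2, 1) shape, which suggests len(input) is 2 here, i.e., the tensor is laid out as (2, 10) rather than (10, 2, 1):

```
import torch

x = torch.randn(2, 10)  # 20 elements in total
x.view(len(x), 1)       # asks for shape (2, 1) -> RuntimeError: invalid for input of size 20
```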

@vdw Would you please suggest an approach I can use here?

UPDATE: I changed the input shape to the model to torch.Size([2, 10, 1]), which is number of variables = 2 (the 2 companies), then sequence length = 10, then input size = 1 (index 0 of the first dimension is company A, index 1 is company B). So we can have input_size=1, right? Would this be a correct approach?
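Concretely, something like this is what I mean (a sketch; note that nn.LSTM defaults to batch_first=False, so the batch_first=True flag is needed for the first dimension to be read as the companies):

```
import torch
import torch.nn as nn

x = torch.randn(2, 10, 1)  # (companies=2, sequence_length=10, input_size=1)

# without batch_first=True, (2, 10, 1) would be read as (seq_len=2, batch=10, input_size=1)
lstm = nn.LSTM(input_size=1, hidden_size=32, num_layers=2, batch_first=True)

out, _ = lstm(x)
print(out.shape)  # torch.Size([2, 10, 32])
```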