Input format for an LSTM with a rolling window

Hello,

I am a beginner working on time series prediction for stock prices. Using the rolling window technique, I fetch 20 days of the stock price, insert the values into a tensor, and label the 21st day as the day I want to predict. Currently I am only using the closing price of the stock, so the input is univariate. The input tensor that is passed to the forward method looks like the following (a sketch of how I build these windows is below the tensor):

tensor([[[0.6675],
         [0.7126],
         [0.7064],
         [0.7311],
         [0.7539],
         [0.6859],
         [0.7271],
         [0.7601],
         [0.7252],
         [0.7723],
         [0.6569],
         [0.6262],
         [0.7044],
         [0.7456],
         [0.7087],
         [0.7126],
         [0.6510],
         [0.6694],
         [0.6859],
         [0.7417]]])
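
For context, I build the windows roughly like this (a minimal sketch; the names closes and make_windows are illustrative, not my exact code):

import numpy as np
import torch

def make_windows(closes, window=20):
    # closes: 1-D array of normalized closing prices
    # returns inputs of shape (num_samples, window, 1) and labels of shape (num_samples,)
    xs, ys = [], []
    for i in range(len(closes) - window):
        xs.append(closes[i:i + window])  # days 1..20 as the input window
        ys.append(closes[i + window])    # day 21 as the label
    x = torch.tensor(np.array(xs), dtype=torch.float32).unsqueeze(-1)
    y = torch.tensor(np.array(ys), dtype=torch.float32)
    return x, y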

My questions are:

  1. The output has the size torch.Size([20, 1]) after the linear layer, and to extract the correct value I take the last value of the linear layer's output, t = t[-1]. Is this the correct way to do it?

  2. The documentation says that the input tensor should be (seq_len, batch, input_size). I am wondering if my rolling window tensor is correct. What confuses me is that I assume the sequence length should be 20, but printing my tensor's shape gives torch.Size([1, 20, 1]). Should I reshape the input into torch.Size([20, 1, 1])? (The snippet just after this list shows the reshape I have in mind.)

  3. Related to the first question: is there a way to implement the model so that I do not have to extract the last value with t = t[-1]? In other words, can the linear layer directly give the last value, the 21st-day prediction, or should I keep it as it is?
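
Regarding question 2, this is the reshape I have in mind (a sketch; window stands for the [1, 20, 1] tensor printed above):

# current shape: (batch, seq_len, input_size) = (1, 20, 1)
window = window.permute(1, 0, 2)  # -> (seq_len, batch, input_size) = (20, 1, 1)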

I would really appreciate it if someone could point me in the right direction or drop a link to a good source that explains these questions. Below is the model:

import torch
import torch.nn as nn

class LSTM(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, output_size, batch_size):
        super(LSTM, self).__init__()
        self.input_size = input_size    # 1
        self.hidden_size = hidden_size  # 32
        self.num_layers = num_layers    # 2
        self.output_size = output_size  # 1
        self.batch_size = batch_size    # 1
        self.lstm = nn.LSTM(self.input_size, self.hidden_size, self.num_layers)
        self.linear = nn.Linear(hidden_size, output_size)

    def hidden_cell(self, t):
        # fresh zero states; t.shape[1] is the batch dimension of a
        # (seq_len, batch, input_size) input
        h_0 = torch.zeros(self.num_layers, t.shape[1], self.hidden_size)
        c_0 = torch.zeros(self.num_layers, t.shape[1], self.hidden_size)
        return h_0, c_0

    def forward(self, t):
        t, (h_n, c_n) = self.lstm(t, self.hidden_cell(t))
        t = t.view(-1, self.hidden_size)  # flatten to (seq_len * batch, hidden_size)
        t = self.linear(t)
        t = t[-1]  # take the last value as the prediction
        return t
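
For reference, I call the model roughly like this (a sketch; the random window stands in for one normalized 20-day window reshaped to (20, 1, 1)):

model = LSTM(input_size=1, hidden_size=32, num_layers=2, output_size=1, batch_size=1)
window = torch.randn(20, 1, 1)  # (seq_len, batch, input_size)
pred = model(window)
print(pred.shape)  # torch.Size([1]) after taking t[-1]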

Thank you.

The problem that I currently see is that you use the full per-step output of self.lstm for the next layer. Let me use different names to avoid confusion:

output, (h_n, c_n) = self.lstm(t, self.hidden_cell(t))

Now output should have the shape (20, 1, 32); note that output gives you the hidden states for all 20 steps. The problem is that after:

t = output.view(-1, self.hidden_size)

t has the shape (20, 32). For the linear layer that means you are giving it a batch of size 20. The linear layer doesn't know or care that the 20 comes from the number of steps; it's a semantic error. Hence your output shape is (20, 1), but what you want is (1, 1), since your batch size and output size are both 1.
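
You can verify these shapes with a tiny standalone check (a sketch using your hyperparameters):

import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=1, hidden_size=32, num_layers=2)
t = torch.randn(20, 1, 1)     # (seq_len, batch, input_size)
output, (h_n, c_n) = lstm(t)  # initial states default to zeros
print(output.shape)           # torch.Size([20, 1, 32])
print(h_n.shape)              # torch.Size([2, 1, 32])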

You have to use only the last layer of the last hidden state, h_n[-1], instead of output. h_n has the shape (2, 1, 32) since you have 2 layers, so h_n[-1] has the shape (1, 32). Then you also no longer need the view(). You can directly do

t = self.linear(h_n[-1])

which will give you the right final output with a shape of (1, 1), or (batch_size, 1) once you have batches with more than one sequence.
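
Putting it together, your forward could look like this (a sketch), which also answers your third question:

def forward(self, t):
    _, (h_n, c_n) = self.lstm(t, self.hidden_cell(t))
    return self.linear(h_n[-1])  # (batch_size, output_size); no t[-1] needed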

Thank you, your advice helped me so much!