Multi-Step Time Series LSTM Network

OK, thanks, I understand. I changed my code around a bit, and now the initial input token is the last value of the known input data, which should be a bit better than zero.

Hi all,
I was using this post as a guide to do the exact same thing (I think). I was just confused about the batches.
I have a large panel dataset with approximately 670 features observed over more than 1,800 days. My idea was to build a model that encodes sequences of these features (of length 45) and decodes another series of length 7.
It works as follows: I take X.iloc[n:n+45, :] (a subset of all features over 45 periods, for n = 0, 1, …) and y[n+45:n+45+7] (a sequence of length 7 that starts just after the features end).
So a training pair for my model should be (X.iloc[n:n+45, :], y[n+45:n+45+7]) for some n.
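
In code, the windowing I have in mind looks roughly like this (just a sketch; ENC_LEN, DEC_LEN, and make_pairs are my names, and I assume X is a DataFrame and y a NumPy array or Series):

import torch

ENC_LEN, DEC_LEN = 45, 7  # encoder input length, decoder target length

def make_pairs(X, y):
    # Slide a window over the panel: 45 periods of all features,
    # followed by the 7 target values that come right after.
    pairs = []
    for n in range(len(X) - ENC_LEN - DEC_LEN + 1):
        x_win = torch.tensor(X.iloc[n:n + ENC_LEN, :].values, dtype=torch.float32)
        y_win = torch.tensor(y[n + ENC_LEN:n + ENC_LEN + DEC_LEN], dtype=torch.float32)
        pairs.append((x_win, y_win))  # shapes: (45, 670) and (7,)
    return pairs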

I understood that in the example shown I should input these as batches (if n = 0, 1, …, N, I would have N+1 training pairs in a batch) and train with all the training pairs at every iteration. Am I correct?
I ask because I have seen examples applied to NLP where there are no batches (as in https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html); instead, every iteration of the training is done with a different training pair.
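
For concreteness, here is the difference as I understand it (a sketch reusing make_pairs from above; the batch sizes are illustrative):

import torch
from torch.utils.data import DataLoader, TensorDataset

pairs = make_pairs(X, y)
xs = torch.stack([p[0] for p in pairs])  # (N+1, 45, 670)
ys = torch.stack([p[1] for p in pairs])  # (N+1, 7)
dataset = TensorDataset(xs, ys)

# Approach 1: every iteration trains on a (mini-)batch of pairs;
# batch_size=len(pairs) would use all pairs at once
batched_loader = DataLoader(dataset, batch_size=32, shuffle=True)

# Approach 2 (as in the NLP tutorial): one pair per iteration,
# i.e. the special case batch_size=1
single_loader = DataLoader(dataset, batch_size=1, shuffle=True)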

What is the difference between these two approaches? Did I get it right? Thanks for the help.

Hi, I am also trying to do the same thing. So in the end, did you lag your features by 6 hours?
In Chris’s example implementation, the first 10 hours of input features are matched and trained against the following 6 hours of output. Thanks for the help!

Hi, I’m using pretty much the exact same code, but I’m interested in the output values as well, so I’ve moved the loss computation out of the decoder and into the train() method.

import torch
import torch.nn as nn

# Assumes a device has been chosen, e.g.:
# device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

class Encoder(nn.Module):

    def __init__(self, input_size, hidden_dim, num_layers=1):
        super(Encoder, self).__init__()
        print('Initializing Encoder...')

        self.input_size = input_size
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size=input_size, 
                          hidden_size=self.hidden_dim, 
                          num_layers=self.num_layers,
                          batch_first=True)  # Note that "batch_first" is set to "True"
        self.hidden = None

    def init_hidden(self, batch_size):
        return (torch.zeros(self.num_layers, batch_size, self.hidden_dim).to(device),
                torch.zeros(self.num_layers, batch_size, self.hidden_dim).to(device))

    def forward(self, inputs):
        # Push through the LSTM layer; only the final hidden state matters here (the output is unused)
        _, self.hidden = self.lstm(inputs, self.hidden)
        return self.hidden

class Decoder(nn.Module):

    def __init__(self, hidden_dim, num_layers=1):
        super(Decoder, self).__init__()
        print('Initializing Decoder...')
        # input_size=1 since the outputs are single values
        self.lstm = nn.LSTM(1, hidden_dim, num_layers=num_layers, batch_first=True)
        self.out = nn.Linear(hidden_dim, 1)

    def forward(self, decoder_input, outputs, hidden):
        batch_size, num_steps = outputs.shape
        input = decoder_input.view(batch_size, 1)
        # Convert (batch_size, 1) to (batch_size, seq_len=1, input_size=1),
        # since the LSTM was built with batch_first=True
        input = input.unsqueeze(2)

        x = []
        for i in range(num_steps):
            # Push the current input through the LSTM:
            # (batch_size, seq_len=1, input_size=1) because batch_first=True
            output, hidden = self.lstm(input, hidden)
            # Push the output of the last step through the linear layer;
            # returns (batch_size, 1)
            output = self.out(output[:, -1, :])
            # Generate the input for the next step by adding the seq_len
            # dimension back (see above)
            input = output.unsqueeze(2)
            # The loss is now computed outside, in train(); just collect
            # the predictions here
            x.append(output)
        return torch.cat(x, dim=1)

I also have some previous values, so I don’t initialize the decoder with 0 but with previous values of the quantity I want to predict.
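
Concretely, my training step looks roughly like this (a sketch; num_features, inputs, targets, and last_known_values are placeholders for my actual data):

# inputs: (batch_size, seq_len, num_features), targets: (batch_size, 7),
# last_known_values: (batch_size,) -- the last observed target values
encoder = Encoder(input_size=num_features, hidden_dim=128).to(device)
decoder = Decoder(hidden_dim=128).to(device)
criterion = nn.MSELoss()

encoder.hidden = encoder.init_hidden(inputs.size(0))
hidden = encoder(inputs)

# Seed the decoder with the last known values instead of zeros
preds = decoder(last_known_values, targets, hidden)  # (batch_size, 7)
loss = criterion(preds, targets)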

Now I have a problem and a question.
First, my problem: at some point my output is all NaNs, as are the weights of my output layer in the decoder. What could be possible reasons for that?

My first instinct would be to look at the activation function, but I’m not sure how I should add an activation function in this case.

Hi Chris,

Good day.
I am also working on multi-step forecasting.

I tried the example you posted here,
but I got the following warning:

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/loss.py:528: UserWarning: Using a target size (torch.Size([2])) that is different to the input size (torch.Size([2, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
return F.mse_loss(input, target, reduction=self.reduction)

May I know how to solve this problem?
Thanks
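
The warning means the prediction passed to MSELoss has shape (batch_size, 1) while the target has shape (batch_size,), so broadcasting would silently compute the wrong loss. Aligning the two shapes fixes it; a minimal sketch (variable names are illustrative):

# prediction: (batch_size, 1), target: (batch_size,)
loss = criterion(prediction.squeeze(1), target)
# or, equivalently:
loss = criterion(prediction, target.unsqueeze(1))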

I am a bit confused: why are we unrolling the decoder by hand? Is there another way to do this in PyTorch?

I am trying to do the same thing. Is it a valid thing to do?

Gradient explosion?
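
If that is the cause, the usual remedy is to clip the gradients before the optimizer step; a minimal sketch (the optimizer names are illustrative):

import torch

# inside the training loop, between backward() and step()
loss.backward()
torch.nn.utils.clip_grad_norm_(
    list(encoder.parameters()) + list(decoder.parameters()),
    max_norm=1.0)  # cap the gradient norm so the weights cannot blow up to NaN
encoder_optimizer.step()
decoder_optimizer.step()

# optional sanity check: stop as soon as a NaN appears
if torch.isnan(loss):
    raise RuntimeError('loss became NaN')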