I’m currently building an LSTM model in PyTorch to forecast time-series data, using lag features to pass the previous n steps as inputs to the network. I split the data into three sets, i.e., a train-validation-test split, and used the first two to train the model. My validation function takes batches from the validation set via the DataLoader and TensorDataset classes and computes the predicted values by passing them through the LSTM model. Initially, I got pretty good results, with R2 values in the region of 0.85-0.95.
However, I have an uneasy feeling about whether this validation function is also suitable for testing my model’s performance, because it feeds the actual X values, i.e., the time-lag features, from the DataLoader into the model to predict y^ values, i.e., predicted target values, instead of feeding the previously predicted y^ values back in as features for the next prediction. That seems far from a real forecasting scenario, where the model has no knowledge of the actual values of the previous time steps, especially when forecasting over longer horizons, say 3-6 months.
I’m currently a bit puzzled about how to tackle this issue and define a function that predicts future values relying on the model’s own outputs rather than the actual values in the test set. I have the following function, predict, which makes a one-step prediction, but I haven’t figured out how to predict the whole test dataset using a DataLoader.
def predict(self, x):
    # move the features to the same device as the model
    x = x.to(device)
    # make a one-step prediction
    yhat = self.model(x)
    # move the result back to the CPU and convert to a NumPy array
    yhat = yhat.cpu().detach().numpy()
    return yhat
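To make concrete what I mean by feeding predictions back in, here is a rough sketch of the kind of recursive loop I have in mind. The forecast helper, the single-feature window shape, and the assumption that the model maps a window of shape [1, seq_len, 1] to one next value are all my own simplifications, not working code from my project:

```python
import torch
import torch.nn as nn

def forecast(model, last_window, n_steps):
    """Recursively predict n_steps ahead, feeding each prediction
    back in as the newest lag feature.

    last_window: tensor of shape [1, seq_len, 1] holding the most
    recent observed values (a sketch; assumes one feature per step).
    """
    model.eval()
    window = last_window.clone()
    preds = []
    with torch.no_grad():
        for _ in range(n_steps):
            yhat = model(window)              # shape: [1, 1]
            preds.append(yhat.item())
            # drop the oldest step, append the prediction as the newest
            window = torch.cat([window[:, 1:, :],
                                yhat.view(1, 1, 1)], dim=1)
    return preds
```

I suspect something along these lines replaces the DataLoader entirely for the test horizon, since each input window now depends on the previous prediction.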
You can find how I split and load my datasets, my constructor for the LSTM model, and the validation function below. If you need more information, please do not hesitate to reach out to me.
Splitting and Loading Datasets
def create_tensor_datasets(X_train_arr, X_val_arr, X_test_arr,
                           y_train_arr, y_val_arr, y_test_arr):
    train_features = torch.Tensor(X_train_arr)
    train_targets = torch.Tensor(y_train_arr)
    val_features = torch.Tensor(X_val_arr)
    val_targets = torch.Tensor(y_val_arr)
    test_features = torch.Tensor(X_test_arr)
    test_targets = torch.Tensor(y_test_arr)

    train = TensorDataset(train_features, train_targets)
    val = TensorDataset(val_features, val_targets)
    test = TensorDataset(test_features, test_targets)
    return train, val, test


def load_tensor_datasets(train, val, test, batch_size=64,
                         shuffle=False, drop_last=True):
    train_loader = DataLoader(train, batch_size=batch_size,
                              shuffle=shuffle, drop_last=drop_last)
    val_loader = DataLoader(val, batch_size=batch_size,
                            shuffle=shuffle, drop_last=drop_last)
    test_loader = DataLoader(test, batch_size=batch_size,
                             shuffle=shuffle, drop_last=drop_last)
    return train_loader, val_loader, test_loader
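For context, this is roughly how the three sets get built and batched; the toy arrays below are just stand-ins for my real lag features (the shapes and split points are illustrative only), with the helper calls inlined so the snippet is self-contained:

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader

# toy arrays standing in for the real lag features (hypothetical shapes):
# 20 samples, 2 lag features per sample, 1 target per sample
X = np.arange(40, dtype=np.float32).reshape(20, 2)
y = np.arange(20, dtype=np.float32).reshape(20, 1)

# chronological split, no shuffling: first 12 train, next 4 val, last 4 test
X_train, X_val, X_test = X[:12], X[12:16], X[16:]
y_train, y_val, y_test = y[:12], y[12:16], y[16:]

train = TensorDataset(torch.Tensor(X_train), torch.Tensor(y_train))
val = TensorDataset(torch.Tensor(X_val), torch.Tensor(y_val))
test = TensorDataset(torch.Tensor(X_test), torch.Tensor(y_test))

train_loader = DataLoader(train, batch_size=4, shuffle=False, drop_last=True)
xb, yb = next(iter(train_loader))  # one batch of features and targets
```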
class LSTMModel(nn.Module):
    def __init__(self, input_dim, hidden_dim, layer_dim, output_dim, dropout_prob):
        super(LSTMModel, self).__init__()
        self.hidden_dim = hidden_dim
        self.layer_dim = layer_dim
        self.lstm = nn.LSTM(
            input_dim, hidden_dim, layer_dim,
            batch_first=True, dropout=dropout_prob
        )
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # fresh zero hidden and cell states on the same device as the input
        h0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim, device=x.device)
        c0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim, device=x.device)
        out, (hn, cn) = self.lstm(x, (h0, c0))
        # keep only the output of the last time step
        out = out[:, -1, :]
        out = self.fc(out)
        return out
Validation (defined within a trainer class)
def validation(self, val_loader, batch_size, n_features):
    # evaluation mode disables dropout; set it once before the loop
    self.model.eval()
    predictions = []
    values = []
    with torch.no_grad():
        for x_val, y_val in val_loader:
            x_val = x_val.view([batch_size, -1, n_features]).to(device)
            y_val = y_val.to(device)
            yhat = self.model(x_val)
            predictions.append(yhat.cpu().numpy())
            values.append(y_val.cpu().numpy())
    return predictions, values
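For completeness, this is roughly how I turn the per-batch outputs of validation into an R2 score; the r2_from_batches helper below is my own hand-rolled stand-in for sklearn.metrics.r2_score, included only so the snippet is self-contained:

```python
import numpy as np

def r2_from_batches(predictions, values):
    """Flatten the per-batch arrays returned by validation() and
    compute the coefficient of determination (R2) by hand."""
    yhat = np.concatenate([p.reshape(-1) for p in predictions])
    y = np.concatenate([v.reshape(-1) for v in values])
    ss_res = np.sum((y - yhat) ** 2)         # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)     # total sum of squares
    return 1.0 - ss_res / ss_tot
```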