LSTM Seq2Seq for time series prediction

I am trying to build an LSTM Seq2Seq neural network where, from the 11 variables I have, I want to predict only 1. However, I think something in the structure is wrong, since it gives me negative R² values.

```python
first_index = 746
last_index = 1910

# Select the data for the well
df_well = model.loc[first_index:last_index]

x = df_well.drop(['NPD_WELL_BORE_CODE', 'AVG_DOWNHOLE_PRESSURE', 'DATEPRD'], axis=1)
y = df_well[['AVG_DOWNHOLE_PRESSURE']]

train_size = int(0.90 * len(df_well))

x_train, y_train = x[:train_size], y[:train_size]
x_test, y_test = x[train_size:], y[train_size:]
```

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

class Seq2SeqModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, window_size, horizon):
        super(Seq2SeqModel, self).__init__()
        self.encoder = nn.LSTM(input_size, hidden_size, num_layers=11)
        self.decoder = nn.LSTM(hidden_size, hidden_size, num_layers=11)
        self.linear = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # Encoder
        _, (hidden, cell) = self.encoder(x)

        # Decoder
        decoder_input = torch.zeros_like(x)  # initialize with zeros
        output, _ = self.decoder(decoder_input, (hidden, cell))

        # Output
        output = self.linear(output.squeeze(0))
        return output
```

```python
# Prepare data
x_train_tensor = torch.tensor(x_train.values, dtype=torch.float32)
y_train_tensor = torch.tensor(y_train.values, dtype=torch.float32)

x_test_tensor = torch.tensor(x_test.values, dtype=torch.float32)
y_test_tensor = torch.tensor(y_test.values, dtype=torch.float32)

# Create datasets and data loaders
train_dataset = TensorDataset(x_train_tensor, y_train_tensor)
test_dataset = TensorDataset(x_test_tensor, y_test_tensor)

batch_size = 40
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
```

```python
# Configure model, loss function, and optimizer
#input_size = x_train.shape[1]
#output_size = y_train.shape[1]
#hidden_size = 11

#model = Seq2SeqModel(input_size, hidden_size, output_size)  # replace 'TuModeloLSTM' with the name of your model
#criterion = nn.MSELoss()  # you can change the loss function as needed
#optimizer = torch.optim.Adam(model.parameters(), lr=0.4)  # you can adjust the optimizer and learning rate

# Configure model, loss function, and optimizer
input_size = x_train.shape[1]
output_size = y_train.shape[1]
hidden_size = 11
window_size = 40  # adjust as needed
horizon = 40      # adjust as needed

model = Seq2SeqModel(input_size, hidden_size, output_size, window_size, horizon)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
```

```python
# Training
num_epochs = 200
for epoch in range(num_epochs):
    for batch_x, batch_y in train_loader:
        optimizer.zero_grad()

        if isinstance(batch_x, torch.Tensor) and batch_x.is_sparse:
            batch_x = batch_x.to_dense()
        output = model(batch_x)
        loss = criterion(output, batch_y)
        loss.backward()
        optimizer.step()

    # Print the loss at the end of each epoch
    print(f'Epoch {epoch + 1}/{num_epochs}, Loss: {loss.item()}')
```

```
Epoch 143/200, Loss: 356.6650390625
Epoch 144/200, Loss: 120.84323120117188
Epoch 145/200, Loss: 407.1255798339844
Epoch 146/200, Loss: 966.4486083984375
Epoch 147/200, Loss: 778.9110107421875
Epoch 148/200, Loss: 379.4190368652344
Epoch 149/200, Loss: 201.9375762939453
Epoch 150/200, Loss: 998.202880859375
Epoch 151/200, Loss: 7103.08642578125
Epoch 152/200, Loss: 583.3623046875
Epoch 153/200, Loss: 581.837646484375
Epoch 154/200, Loss: 197.0320281982422
Epoch 155/200, Loss: 6781.46630859375
Epoch 156/200, Loss: 149.66714477539062
Epoch 157/200, Loss: 1184.738037109375
Epoch 158/200, Loss: 1178.5792236328125
Epoch 159/200, Loss: 150.41824340820312
Epoch 160/200, Loss: 562.591552734375
Epoch 161/200, Loss: 285.3613586425781
Epoch 162/200, Loss: 273.0677490234375
Epoch 163/200, Loss: 1095.458251953125
Epoch 164/200, Loss: 6904.693359375
Epoch 165/200, Loss: 269.6636047363281
Epoch 166/200, Loss: 399.09637451171875
Epoch 167/200, Loss: 87.79035186767578
Epoch 168/200, Loss: 6977.03955078125
Epoch 169/200, Loss: 881.690673828125
Epoch 170/200, Loss: 299.3782653808594
Epoch 171/200, Loss: 597.3984375
Epoch 172/200, Loss: 854.1024780273438
Epoch 173/200, Loss: 1044.854736328125
Epoch 174/200, Loss: 99.43424987792969
Epoch 175/200, Loss: 857.2242431640625
Epoch 176/200, Loss: 1315.69384765625
Epoch 177/200, Loss: 138.40518188476562
Epoch 178/200, Loss: 373.2205810546875
Epoch 179/200, Loss: 1005.6658325195312
Epoch 180/200, Loss: 336.55047607421875
Epoch 181/200, Loss: 225.5845184326172
Epoch 182/200, Loss: 13401.2568359375
Epoch 183/200, Loss: 194.87428283691406
Epoch 184/200, Loss: 187.71463012695312
Epoch 185/200, Loss: 6852.80712890625
Epoch 186/200, Loss: 1094.6590576171875
Epoch 187/200, Loss: 216.69117736816406
Epoch 188/200, Loss: 13514.875
Epoch 189/200, Loss: 741.24853515625
Epoch 190/200, Loss: 268.2852783203125
Epoch 191/200, Loss: 843.3907470703125
Epoch 192/200, Loss: 243.44464111328125
Epoch 193/200, Loss: 745.4027099609375
Epoch 194/200, Loss: 6966.73779296875
Epoch 195/200, Loss: 7471.0400390625
Epoch 196/200, Loss: 345.9413757324219
Epoch 197/200, Loss: 228.78480529785156
Epoch 198/200, Loss: 1013.11279296875
Epoch 199/200, Loss: 152.55259704589844
Epoch 200/200, Loss: 791.8948364257812
```

```python
from sklearn.metrics import r2_score, mean_absolute_error
import numpy as np

# Put the model in evaluation mode
model.eval()

# Lists to store predictions and true labels
predictions = []
true_labels = []

# Make predictions on the test set
with torch.no_grad():
    for batch_x, batch_y in test_loader:
        # Make sure batch_x is dense
        if isinstance(batch_x, torch.Tensor) and batch_x.is_sparse:
            batch_x = batch_x.to_dense()

        # Get the model's predictions
        output = model(batch_x)

        # Store predictions and true labels
        predictions.append(output.numpy())
        true_labels.append(batch_y.numpy())

# Concatenate the lists into arrays
predictions = np.concatenate(predictions, axis=0)
true_labels = np.concatenate(true_labels, axis=0)

# Compute the evaluation metric (e.g., MSE)
mse = ((predictions - true_labels) ** 2).mean()
print(f'MSE on the test set: {mse}')

# Compute R^2
r2 = r2_score(true_labels, predictions)
print(f'R^2 on the test set: {r2}')
```

If someone can help me fix my network, I would be very grateful, and I would give a small reward.

Please fix the formatting of your code; it's very difficult to read.

Apart from that, your encoder-decoder setup looks odd:

  • If your input sequence has length L, your decoder will also return a sequence of length L.

  • All the inputs to your decoder are zero vectors (i.e., the input at every time step is a zero vector), but the input at the (i+1)-th step should depend on the output of the i-th step, at least intuitively; I'm not 100% sure what your intended learning task is. See the sketch right after this list.
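For illustration, here is a rough sketch of an autoregressive decoder where each step's input is the previous step's prediction. The class name, layer sizes, and feed-back scheme are assumptions for the example, not necessarily what your task needs:

```python
import torch
import torch.nn as nn

class Seq2SeqAutoregressive(nn.Module):
    """Encoder-decoder LSTM where the decoder feeds its own previous
    prediction back in as the next input (teacher forcing omitted)."""

    def __init__(self, input_size, hidden_size, output_size, horizon):
        super().__init__()
        self.horizon = horizon
        self.encoder = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.decoder = nn.LSTM(output_size, hidden_size, batch_first=True)
        self.linear = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # x: [batch, seq_len, input_size]
        _, (hidden, cell) = self.encoder(x)

        batch_size = x.size(0)
        # First decoder input: a zero "start" value with output_size features
        decoder_input = x.new_zeros(batch_size, 1, self.linear.out_features)
        outputs = []
        for _ in range(self.horizon):
            out, (hidden, cell) = self.decoder(decoder_input, (hidden, cell))
            step_pred = self.linear(out)    # [batch, 1, output_size]
            outputs.append(step_pred)
            decoder_input = step_pred       # feed the prediction back in
        return torch.cat(outputs, dim=1)    # [batch, horizon, output_size]

# Example usage with made-up sizes
model = Seq2SeqAutoregressive(input_size=11, hidden_size=32, output_size=1, horizon=5)
pred = model(torch.randn(8, 40, 11))        # -> [8, 5, 1]
```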


The task I want to do is: using my 11 known variables, predict only 1 variable, which in this case is y.

LSTM expects batch on dim=1, unless you specify batch_first=True when instantiating the layer.
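A quick illustrative sketch of the two layouts (the sizes here are arbitrary, not your data):

```python
import torch
import torch.nn as nn

# batch_first=True -> input is [batch_size, sequence_length, features]
lstm = nn.LSTM(input_size=11, hidden_size=32, batch_first=True)
x = torch.randn(40, 20, 11)          # [batch=40, seq_len=20, features=11]
out, (h, c) = lstm(x)
print(out.shape)                     # torch.Size([40, 20, 32])

# default (batch_first=False) -> input is [sequence_length, batch_size, features]
lstm2 = nn.LSTM(input_size=11, hidden_size=32)
x2 = torch.randn(20, 40, 11)         # [seq_len=20, batch=40, features=11]
out2, _ = lstm2(x2)
print(out2.shape)                    # torch.Size([20, 40, 32])
```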

What are the dims and sizes going into your model? For example: [batch_size, sequence_length, features]
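One way to check is to print the shape of a single batch before it goes into the model. If I read your code correctly, the tensors are built straight from the DataFrame, so a batch should still be 2-D:

```python
batch_x, batch_y = next(iter(train_loader))
print(batch_x.shape, batch_y.shape)
# Expected with the code above: [batch_size, num_features] and [batch_size, 1],
# i.e. there is no explicit sequence_length (time) dimension yet.
```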

Are you predicting 1 variable once, or 1 variable for multiple time steps? For example, given some weather data, do you want to predict just the temperature for the next day, or for the next k days?

If it's the former, you wouldn't need an encoder-decoder architecture.
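In that case, something along these lines might already be enough: a single LSTM reading a window of the 11 features, with a linear head predicting the one target value. This is only a rough sketch; the class name, sizes, and random input are placeholders, and you would still need to reshape your data into windows of shape [batch, window_size, features]:

```python
import torch
import torch.nn as nn

class SimpleLSTMRegressor(nn.Module):
    """Single LSTM that reads a window of features and predicts one value."""

    def __init__(self, input_size=11, hidden_size=32, output_size=1):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.linear = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # x: [batch, window_size, input_size]
        out, _ = self.lstm(x)
        # Use only the last time step's hidden state for the prediction
        return self.linear(out[:, -1, :])   # [batch, output_size]

# Example usage with made-up sizes
model = SimpleLSTMRegressor()
pred = model(torch.randn(40, 40, 11))       # -> [40, 1]
```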