LSTM: Prediction does not change when looping over test data

I am trying to predict a single target variable from 7 features, using sequences of 4 time steps, with an LSTM model in PyTorch. I’m a beginner with this and I am running into some difficulties.

Data

# Shape X_train: torch.Size([24433, 4, 7])
# Shape Y_train: torch.Size([24433, 4, 1])

# Shape X_test: torch.Size([6109, 4, 7])
# Shape Y_test: torch.Size([6109, 4, 1])

import torch
from torch import nn
from torch.utils.data import TensorDataset

train_dataset = TensorDataset(X_train, Y_train)
test_dataset = TensorDataset(X_test, Y_test)

train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=32, shuffle=False)
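As a quick sanity check on what the loaders yield, here is a minimal sketch that uses random stand-in tensors with the test-set shapes listed above (the real data is not reproduced here; `X_demo`, `Y_demo`, and `demo_loader` are hypothetical names). Each batch comes out as (batch, time steps, features), so the batch dimension is first.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Random stand-ins with the same shapes as X_test / Y_test (assumption: not the real data)
X_demo = torch.randn(6109, 4, 7)
Y_demo = torch.randn(6109, 4, 1)

demo_loader = DataLoader(TensorDataset(X_demo, Y_demo), batch_size=32, shuffle=False)

X_batch, Y_batch = next(iter(demo_loader))
print(X_batch.shape)  # (batch, time steps, features)
print(Y_batch.shape)  # (batch, time steps, targets)

# The last batch is smaller because 6109 is not a multiple of 32
last_X, _ = list(demo_loader)[-1]
print(last_X.shape)
```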

Example of data:

print(train_dataset[0], test_dataset[0])

(tensor([[ 7909.0000,  8094.0000,  9119.0000,  8666.0000, 17599.0000, 13657.0000,
         10158.0000],
        [ 7909.0000,  8073.0000,  9119.0000,  8636.0000, 17609.0000, 13975.0000,
         10109.0000],
        [ 7939.5000,  8083.5000,  9166.5000,  8659.5000, 18124.5000, 13971.0000,
         10142.0000],
        [ 7951.0000,  8064.0000,  9201.0000,  8663.0000, 17985.0000, 13967.0000,
         10076.0000]]), tensor([[41.],
        [41.],
        [41.],
        [41.]]))

(tensor([[ 8411.0000,  8530.0000,  9439.0000,  9101.0000, 17368.0000, 14174.0000,
         11111.0000],
        [ 8460.0000,  8651.5000,  9579.5000,  9355.5000, 17402.0000, 14509.0000,
         11474.5000],
        [ 8436.0000,  8617.0000,  9579.0000,  9343.0000, 17318.0000, 14288.0000,
         11404.0000],
        [ 8519.0000,  8655.0000,  9580.0000,  9348.0000, 17566.0000, 14640.0000,
         11404.0000]]), tensor([[59.],
        [59.],
        [59.],
        [59.]]))
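The features in these samples lie roughly between 8,000 and 18,000, while the targets shown are between 0 and 100. One thing that may be worth checking is whether unscaled inputs of this size saturate the LSTM gates. The following is a minimal sketch of standardizing the features with training-set statistics; the tensors are random stand-ins, and the names `X_train_demo`, `X_test_demo`, and `standardize` are hypothetical.

```python
import torch

torch.manual_seed(0)

# Random stand-ins on a similar scale to the samples above (assumption)
X_train_demo = 8000 + 10000 * torch.rand(1000, 4, 7)
X_test_demo = 8000 + 10000 * torch.rand(200, 4, 7)

# Per-feature statistics over all samples and time steps of the TRAINING set only
mean = X_train_demo.mean(dim=(0, 1), keepdim=True)  # shape (1, 1, 7)
std = X_train_demo.std(dim=(0, 1), keepdim=True)    # shape (1, 1, 7)

def standardize(x):
    """Scale features with the training statistics (hypothetical helper)."""
    return (x - mean) / std

X_train_scaled = standardize(X_train_demo)
X_test_scaled = standardize(X_test_demo)

print(X_train_scaled.mean(dim=(0, 1)))  # close to 0 for each feature
print(X_train_scaled.std(dim=(0, 1)))   # close to 1 for each feature
```

The test set is scaled with the training statistics on purpose, so that no information from the test data enters preprocessing.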

LSTM model

class LSTMModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size)
        self.linear = nn.Linear(hidden_size, output_size)
        
    def forward(self, x):
        x, _ = self.lstm(x)
        x = self.linear(x)
        return x

model = LSTMModel(input_size=7, hidden_size=256, output_size=1)

loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
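One detail of `nn.LSTM` that may matter here: by default (`batch_first=False`) it expects input of shape (sequence length, batch, features), whereas the loaders above yield (batch, time steps, features). The sketch below uses random data and a small hidden size chosen only for illustration. It checks whether perturbing one sample changes the output for a later sample in the same batch, which would indicate that the first dimension is being treated as time.

```python
import torch
from torch import nn

torch.manual_seed(0)

x = torch.randn(32, 4, 7)   # intended meaning: (batch, time steps, features)
x_mod = x.clone()
x_mod[5] += 1.0             # perturb only sample 5

# Default layout expects (seq, batch, features)
lstm_default = nn.LSTM(input_size=7, hidden_size=16)
# batch_first=True expects (batch, seq, features)
lstm_batch_first = nn.LSTM(input_size=7, hidden_size=16, batch_first=True)

with torch.no_grad():
    out_a, _ = lstm_default(x)
    out_a_mod, _ = lstm_default(x_mod)
    out_b, _ = lstm_batch_first(x)
    out_b_mod, _ = lstm_batch_first(x_mod)

# Default layout: dimension 0 is time, so "sample" 6 can see the change made to sample 5
leaks_default = not torch.allclose(out_a[6], out_a_mod[6])
# batch_first=True: samples are processed independently of each other
leaks_batch_first = not torch.allclose(out_b[6], out_b_mod[6])

print(leaks_default, leaks_batch_first)
```

Both modules return an output of shape (32, 4, 16), so the layout mismatch does not raise an error; only the meaning of the dimensions differs.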

Looping over data

pred_train = []
true_train = []
 
model.train()

# Loop over the training set
for X, Y in train_loader:

    optimizer.zero_grad()
    
    Y_pred = model(X)
    
    pred_train.append(Y_pred)
    true_train.append(Y) 

    loss = loss_fn(Y_pred, Y)
    
    loss.backward()
    
    optimizer.step()

model.eval()

pred_test = []
true_test = [] 

# Loop over the test set
for X, Y in test_loader:

    Y_pred = model(X)
    
    pred_test.append(Y_pred)
    true_test.append(Y)
    
    loss = loss_fn(Y_pred, Y)
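For the evaluation pass, a common pattern is to disable gradient tracking and accumulate the loss, so that the stored predictions do not keep the autograd graph alive. A minimal sketch with stand-in data and a stand-in model (`demo_model`, `demo_loader`, and `evaluate` are hypothetical names, not part of the code above):

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Stand-in data and model (assumptions for illustration only)
demo_loader = DataLoader(
    TensorDataset(torch.randn(100, 4, 7), torch.randn(100, 4, 1)),
    batch_size=32, shuffle=False,
)
demo_model = nn.Sequential(nn.Linear(7, 8), nn.ReLU(), nn.Linear(8, 1))
loss_fn = nn.MSELoss()

def evaluate(model, loader):
    """Return all predictions, all targets, and the sample-weighted mean loss."""
    model.eval()
    preds, trues, total_loss, n = [], [], 0.0, 0
    with torch.no_grad():  # no graph is built, so stored outputs are plain tensors
        for X, Y in loader:
            Y_pred = model(X)
            total_loss += loss_fn(Y_pred, Y).item() * X.size(0)
            n += X.size(0)
            preds.append(Y_pred)
            trues.append(Y)
    return torch.cat(preds), torch.cat(trues), total_loss / n

pred_all, true_all, mean_loss = evaluate(demo_model, demo_loader)
print(pred_all.shape, true_all.shape, mean_loss)
```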

Checking predictions

print(true_train[0], pred_train[0])  # the same pattern holds for any batch index i
print(true_test[0], pred_test[0])

I get (shortened):

# True train data (L) & predicted train data (R)

tensor([[[  3.],     tensor([[[ 0.1095],
         [  3.],     [ 0.0221],
         [  3.],     [ 0.0087],
         [  3.]],    [-0.0308]],

        [[100.],     [[ 0.0922],
         [  0.],     [ 0.0395],
         [  0.],     [-0.0423],
         [  0.]],    [-0.0592]],

        [[ 57.],     [[ 0.0228],
         [ 57.],     [-0.0332],
         [ 57.],     [ 0.0296],
         [ 57.]],    [ 0.0018]],

         ...         ...

# True test data (L) & predicted test data (R)

tensor([[[ 59.],     tensor([[[20.6179],
         [ 59.],     [20.6179],
         [ 59.],     [20.6179],
         [ 59.]],    [20.6179]],

        [[ 70.],     [[23.4562],
         [ 70.],     [23.4562],
         [ 70.],     [23.4562],
         [ 70.]],    [23.4562]],

        [[  0.],     [[23.8913],
         [  0.],     [23.8913],
         [  0.],     [23.8913],  
         [  0.]],    [23.8913]],

         ...         ...

                     [[23.9606],
                     [23.9606],
                     [23.9606],
                     [23.9606]],

The training predictions also change noticeably from one iteration to the next:

print(pred_train[0], pred_train[5], pred_train[10])
tensor([[[ 0.1095],
         [ 0.0221],
         [ 0.0087],
         [-0.0308]],

        [[ 0.0922],
         [ 0.0395],
         [-0.0423],
         [-0.0592]],
...
tensor([[[18.4983],
         [18.4983],
         [18.4983],
         [18.4983]],

        [[20.6157],
         [21.0552],
         [21.0552],
         [21.0552]],
...

tensor([[[25.8706],
         [25.8706],
         [25.8706],
         [25.8706]],

        [[29.2633],
         [29.2633],
         [29.2633],
         [29.2633]],
...

The later the iteration, the higher the predictions in the training loop become.
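Part of this pattern may simply reflect that `pred_train` is filled while the weights are still changing: `pred_train[0]` is computed before the first optimizer step, and later entries come from a partially trained model. A toy sketch of that effect, assuming a stand-in linear model and a constant target of 40 (both chosen only for illustration):

```python
import torch
from torch import nn

torch.manual_seed(0)

toy_model = nn.Linear(7, 1)
optimizer = torch.optim.Adam(toy_model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

mean_pred_per_step = []
for step in range(100):
    X = torch.randn(32, 7)
    Y = torch.full((32, 1), 40.0)  # constant target, for illustration

    optimizer.zero_grad()
    Y_pred = toy_model(X)
    # Recorded BEFORE this step's update, like pred_train in the loop above
    mean_pred_per_step.append(Y_pred.mean().item())

    loss_fn(Y_pred, Y).backward()
    optimizer.step()

print(mean_pred_per_step[0], mean_pred_per_step[10], mean_pred_per_step[-1])
```

To compare predictions fairly, one option is to collect them in a separate pass after training has finished, rather than inside the training loop.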

My question

As you can see, the predictions (output) made in the test loop remain roughly the same, and eventually they become constant at 23.9606.

Why is the output the same for every iteration of the test loop, and why do the predictions grow over the course of the training loop? What am I doing wrong, and what should I change to get correct output?