LSTM model not learning

Hello everyone,

I did some research but couldn't find a solution so far.

I am trying to make a categorical prediction on a time series dataset. I have a training dataset with the following size:
torch.Size([3749, 1, 62]): number of samples, windows of 1 day, 62 features
Labels:
torch.Size([3749]) with categories 0, 1, 2

This is my model:


class LSTM(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_layers, output_dim):
        super(LSTM, self).__init__()
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers
        
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, output_dim) 
        
    def forward(self, x):
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim)
        out, (hn, cn) = self.lstm(x, (h0, c0))
        out = self.fc(out) 
        out = F.softmax(out,dim=1)
        return out

This is the train loop:


input_dim = 62
hidden_dim = 80
num_layers = 1
output_dim = 1
num_epochs = 10

model = LSTM(input_dim=input_dim, hidden_dim=hidden_dim, output_dim=output_dim, num_layers=num_layers)
criterion = torch.nn.MSELoss()
optimiser = torch.optim.SGD(model.parameters(), lr=0.01)

hist = np.zeros(10)

lstm = []

for t in range(num_epochs):
    optimiser.zero_grad()
    y_train_pred = model(X_train)    

    loss = criterion(y_train_pred, y_train.reshape(y_train.shape[0],1,1))
    print("Epoch ", t, "MSE: ", loss.item())
    hist[t] = loss.item() 

    loss.backward()
    optimiser.step()

This is the output:

Epoch  0 MSE:  0.23072819411754608
Epoch  1 MSE:  0.23072819411754608
Epoch  2 MSE:  0.23072819411754608
Epoch  3 MSE:  0.23072819411754608
Epoch  4 MSE:  0.23072819411754608
Epoch  5 MSE:  0.23072819411754608
Epoch  6 MSE:  0.23072819411754608
Epoch  7 MSE:  0.23072819411754608
Epoch  8 MSE:  0.23072819411754608
Epoch  9 MSE:  0.23072819411754608

As you can see, the model does not learn, and the same happens even with more epochs.

Do you have any ideas?

Thanks

This is happening because your training loop only iterates over epochs and not over your dataset. Because of this, X_train never changes, so your model predicts the same thing every time and the loss will not change. You also need to loop over your dataset, for example:

for t in range(num_epochs):
    for batch_idx, (inputs, targets) in enumerate(dataloader):
        optimiser.zero_grad()
        y_train_pred = model(inputs)

        loss = criterion(y_train_pred, targets.reshape(targets.shape[0],1,1))
        loss.backward()
        optimiser.step()
    # loss here is the loss of the last batch in the epoch
    print("Epoch ", t, "MSE: ", loss.item())
    hist[t] = loss.item()

Whatever your dataset is, you need to loop through it so that you get different batches.

Many thanks for your reply. This is also something I have been trying:

lstm = []

for t in range(num_epochs):
    for inputs, targets in zip(X_train,y_train):
         optimiser.zero_grad()
         y_train_pred = model(inputs)    

         loss = criterion(y_train_pred, targets.reshape(targets.shape[0],1,1))
         loss.backward()
         optimiser.step()
    print("Epoch ", t, "MSE: ", loss.item())
    hist[t] = loss.item()

However, using my actual dataset I got this error:
RuntimeError: input must have 3 dimensions, got 2

Should I reshape my dataset to avoid this error?

Yes, in your training loop you can just write

inputs = inputs.unsqueeze(1)    # (1, 62) -> (1, 1, 62)
targets = targets.unsqueeze(0)  # 0-dim label -> shape (1,); a 0-dim tensor has no dim 1

and that should help.
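To see where the "input must have 3 dimensions, got 2" error comes from, here is a minimal sketch using random stand-in data (not the real dataset). Iterating over a tensor with zip walks its first dimension, so each item loses the batch axis and has to get it back before reaching nn.LSTM.

```python
import torch

# Synthetic stand-in for the thread's data: 8 samples, 1 time step, 62 features.
X_train = torch.randn(8, 1, 62)
y_train = torch.randint(0, 3, (8,))

# zip iterates over dim 0, so every item comes out without the batch axis.
inputs, targets = next(iter(zip(X_train, y_train)))
input_shape_before = tuple(inputs.shape)    # 2 dimensions: (1, 62)
target_shape_before = tuple(targets.shape)  # 0 dimensions: ()

# Add the missing axis back.
inputs = inputs.unsqueeze(1)    # (1, 62) -> (1, 1, 62)
targets = targets.unsqueeze(0)  # () -> (1,); dim 0 is the only valid position here
input_shape_after = tuple(inputs.shape)
target_shape_after = tuple(targets.shape)
```

After this, targets.shape[0] is defined again, so the reshape in the loss call no longer fails on a scalar label.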

Hi,
thanks for your reply.

I solved it using TensorDataset and DataLoader:

train_tensor = torch.utils.data.TensorDataset(X_train,y_train)
train_loader = torch.utils.data.DataLoader(dataset = train_tensor, batch_size = 1)
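For comparison, a sketch of the same setup with a larger batch, using random tensors with the shapes given at the top of the thread. The batch_size=64 and shuffle=True values are illustrative assumptions, not settings from the thread. Each batch keeps the (batch, seq_len, features) layout, so the nn.LSTM needs batch_first=True to read it correctly.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in with the shapes from the thread.
X_train = torch.randn(3749, 1, 62)
y_train = torch.randint(0, 3, (3749,))

train_tensor = TensorDataset(X_train, y_train)
# batch_size and shuffle are illustrative choices.
train_loader = DataLoader(train_tensor, batch_size=64, shuffle=True)

inputs, targets = next(iter(train_loader))
batch_input_shape = tuple(inputs.shape)    # (64, 1, 62)
batch_target_shape = tuple(targets.shape)  # (64,)
num_batches = len(train_loader)            # ceil(3749 / 64); the last batch is smaller
```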

Now the loop looks like this:

hist = np.zeros(10)
lstm = []
for t in range(num_epochs):
    for batch_idx, (inputs, targets) in enumerate(train_loader):
         #model.zero_grad()
         optimiser.zero_grad()
         y_train_pred = model(inputs)    

         loss = criterion(y_train_pred.reshape(1), targets.reshape(1))
         loss.backward()
         optimiser.step()
    print("Epoch ", t, "MSE: ", loss.item())
    hist[t] = loss.item()

The result looks a bit different, but it still does not learn:

Epoch  0 MSE:  1.0
Epoch  1 MSE:  1.0
Epoch  2 MSE:  1.0
Epoch  3 MSE:  1.0
Epoch  4 MSE:  1.0
Epoch  5 MSE:  1.0
Epoch  6 MSE:  1.0
Epoch  7 MSE:  1.0
Epoch  8 MSE:  1.0
Epoch  9 MSE:  1.0

I can only imagine that the dataset shape is not appropriate, or that something is wrong in the model.
Do you have any idea?

Thanks

Why are you reshaping the input to the loss function? Also why is your batch_size only 1?
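One likely reason the loss stays constant regardless of batching: with output_dim = 1, the first model's output has shape (batch, 1, 1), and F.softmax(out, dim=1) normalises over a dimension with a single entry. That always yields exactly 1, so the loss no longer depends on the weights. The sketch below uses random stand-in logits rather than the real model.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in for the output of self.fc in the first model: (batch, seq_len=1, output_dim=1).
logits = torch.randn(5, 1, 1, requires_grad=True)
targets = torch.tensor([0.0, 1.0, 2.0, 1.0, 0.0]).reshape(5, 1, 1)

# Softmax over a dimension of size 1 returns 1 for every element.
probs = F.softmax(logits, dim=1)

loss = F.mse_loss(probs, targets)
loss.backward()

all_ones = bool(torch.allclose(probs, torch.ones_like(probs)))
grad_is_zero = bool(torch.allclose(logits.grad, torch.zeros_like(logits.grad)))
```

With every prediction pinned at 1, the MSE is just the mean of (1 - y)^2, which is 1 for labels 0 and 2 and 0 for label 1. That would be consistent with both outputs in the thread: a fixed fraction for the full dataset, and exactly 1.0 when the last single-sample batch has label 0 or 2.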

Hi,
I solved the problem. As I mentioned at the very beginning, this is a classification problem with multiple classes. This is a time series where each step is one day.

I changed everything as follows and it works quite well:

class LSTM(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_layers, output_dim):
        super(LSTM, self).__init__()
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers
        
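        # batch_first=True so inputs are read as (batch, seq_len, features), as the DataLoader yields them
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, output_dim) 
        
    def forward(self, x):
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim)
        out, (hn, cn) = self.lstm(x, (h0.detach(), c0.detach()))
        out = self.fc(out[:, -1, :])  # last time step of each sequence -> (batch, output_dim)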
        #out = F.softmax(out,dim=1)
        return out

input_dim = 62
hidden_dim = 80
num_layers = 1
output_dim = 3 #No of classes
num_epochs = 10

model = LSTM(input_dim=input_dim, hidden_dim=hidden_dim, output_dim=output_dim, num_layers=num_layers)
criterion = torch.nn.CrossEntropyLoss()
optimiser = torch.optim.Adam(model.parameters(), lr=0.01)

hist = np.zeros(10)
lstm = []
for t in range(num_epochs):
    for batch_idx, (inputs, targets) in enumerate(train_loader):
         #model.zero_grad()
         optimiser.zero_grad()
         y_train_pred = model(inputs)    

         loss = criterion(y_train_pred, targets.long())
         loss.backward()
         optimiser.step()
    print("Epoch ", t, "CrossEntropy: ", loss.item())
    hist[t] = loss.item()
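A short sketch of why the softmax can stay commented out: torch.nn.CrossEntropyLoss expects raw, unnormalised scores of shape (batch, num_classes) plus integer class indices, and applies log-softmax internally. The tensors below are random stand-ins for the model output and labels.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

logits = torch.randn(4, 3)            # (batch, num_classes), raw scores as returned by self.fc
targets = torch.tensor([0, 2, 1, 2])  # class indices, dtype torch.int64

criterion = torch.nn.CrossEntropyLoss()
loss = criterion(logits, targets)

# The same value computed in two steps: log-softmax, then negative log-likelihood.
manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)
```

Applying softmax in forward and then passing the result to CrossEntropyLoss would normalise twice, which is why the model returns the linear layer's output directly.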

Now I need to find out how to get the predicted category from the model's output.
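A minimal sketch of one common way to do this, assuming the model returns raw logits of shape (batch, num_classes): take the argmax over the class dimension, and apply softmax only if probabilities are wanted. The predict helper and the small stand-in network are hypothetical names for illustration; the trained LSTM from the thread would take the stand-in's place.

```python
import torch
import torch.nn as nn

def predict(model, inputs):
    """Hypothetical helper: return predicted class indices and class probabilities."""
    model.eval()                               # switch off training-only behaviour
    with torch.no_grad():                      # no gradients needed for inference
        logits = model(inputs)                 # (batch, num_classes)
        probs = torch.softmax(logits, dim=1)   # each row sums to 1
        classes = probs.argmax(dim=1)          # index of the most likely class
    return classes, probs

# Stand-in network with the same input and output sizes as in the thread (62 features, 3 classes).
stand_in = nn.Sequential(nn.Flatten(), nn.Linear(62, 3))
X_new = torch.randn(5, 1, 62)

classes, probs = predict(stand_in, X_new)
```

Since softmax preserves the ordering of the scores, logits.argmax(dim=1) gives the same classes without computing probabilities.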