RuntimeError: input must have 3 dimensions, got 2 LSTM

I am having issues using an LSTM. I get the error RuntimeError: input must have 3 dimensions, got 2. I have looked online, but I am unable to get it to work. I am fairly new to PyTorch, so any help is appreciated. Here is my code:

class TeacherUpdated(nn.Module):
    def __init__(self):
        # Python requires calling the ancestor's initializer manually!
        super(TeacherUpdated, self).__init__()
        self.lstm = nn.LSTM(53, 200, 3, batch_first=True).double()
        self.linear = nn.Linear(200, 2)

    def forward(self, x):
        x, hs = self.lstm(x,200)
        x = x.reshape(-1, 200)
        x = self.linear(x)
        return x

The LSTM expects an input of shape (sequence length, batch size, input size), or (batch size, sequence length, input size) since you set batch_first=True. Can you print out the shape of your x? That is probably the problem.
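
For reference, here is a minimal shape check. The batch size 4 and sequence length 17 are made up, just to show the layout nn.LSTM expects with batch_first=True:

import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=53, hidden_size=200, num_layers=3, batch_first=True)
x = torch.randn(4, 17, 53)       # (batch_size, seq_len, input_size)
out, (h_n, c_n) = lstm(x)
print(out.shape)                 # torch.Size([4, 17, 200])
print(h_n.shape)                 # torch.Size([3, 4, 200]) -> (num_layers, batch_size, hidden_size)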

This is my shape: [10799, 53]

OK, what do those numbers represent? Is the first one supposed to be the sequence length or the batch size? That is the problem: the LSTM requires a 3-dimensional input, (batch size, sequence length, input size) with batch_first=True.

The x is the data being passed into the forward function. In self.lstm = nn.LSTM(53, 200, 3, batch_first=True).double() the arguments are the input size, hidden size, and number of layers, respectively. There is something wrong with x, hs = self.lstm(x,200) and I may not be calling this correctly.

What is the 200 for? To define a hidden and cell state you should do something like this:

def init_state(self, batch_size):
    return (torch.zeros(self.num_layers, batch_size, self.hidden_size),
            torch.zeros(self.num_layers, batch_size, self.hidden_size))

Then do:

hs = self.init_state(batch_size)
x, hs = self.lstm(x,hs)
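
Each of those two tensors comes out with shape (num_layers, batch_size, hidden_size), which is what the LSTM checks the state against. A quick sanity check with a made-up batch size of 4 (and note that if the LSTM was made .double(), these zeros may need to be .double() as well):

import torch

num_layers, batch_size, hidden_size = 3, 4, 200
h0 = torch.zeros(num_layers, batch_size, hidden_size)
c0 = torch.zeros(num_layers, batch_size, hidden_size)
print(h0.shape, c0.shape)   # torch.Size([3, 4, 200]) torch.Size([3, 4, 200])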

I tried this and I still get the same error. This is what I have:

    def __init__(self):
        # Python requires calling the ancestor's initializer manually!
        super(TeacherUpdated, self).__init__()
        self.num_layers = 3
        self.hidden_size = 200
        self.lstm = nn.LSTM(53, self.hidden_size, self.num_layers, batch_first=True).double()
        self.linear = nn.Linear(200, 2)

    def init_state(self, batch_size):
        return (torch.zeros(self.num_layers, batch_size, self.hidden_size),
                torch.zeros(self.num_layers, batch_size, self.hidden_size))

    def forward(self, x):
        hs = self.init_state(200)
        x, hs = self.lstm(x, hs)
        x = x.reshape(-1, 200)
        x = self.linear(x)
        return x

OK, the shape of your x input is still wrong, though. How do you pass the input into the model?

As stated before, the size of x is [10799, 53]. This is how I instantiate the model class: teacher_model = TeacherUpdated(). This is the start of the for loop that passes x into the model:

for step_count, (x, y_gt) in enumerate(train_loader):
    # Initialize gradients with 0
    optimizer.zero_grad()

    x = pd.DataFrame(data=x)
    x = torch.tensor(x.values)
    x = x.to(device)
    y_gt = y_gt.to(device)

    # Predict
    x = torch.flatten(x, start_dim=1, end_dim=-1)
    y_pred = teacher_model(x)

OK, can you print the shape of x before you flatten it? Also, what does the 10799 in the shape represent?

The 10799 is the number of samples in the training set. The shape of x before flattening is still [10799, 53].

Why are you not breaking it into batches? That is your problem: you need a batch dimension. You can either break the data into batches or do x.unsqueeze(0).
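
For example, either of these gives the LSTM a batch dimension. This is just a sketch: train_dataset and batch_size=32 are placeholders for whatever you actually built your loader from.

import torch
from torch.utils.data import DataLoader

# option 1: let the DataLoader batch the data
# train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)

# option 2: treat the whole tensor as a single batch of size 1
x = torch.randn(10799, 53)
x = x.unsqueeze(0)
print(x.shape)   # torch.Size([1, 10799, 53])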

I tried x.unsqueeze(0) in the forward before it gets passed into the LSTM, but that did not work. Doesn't this create a batch dimension?

Yes, it does. Your input only has two dimensions, though. What happened when you unsqueezed it?

Surprisingly, the size of x does not change when I unsqueeze it.

That is weird. Can you send the code where you unsqueeze it?

I just unsqueeze it in my forward function:

    def forward(self, x):
        x.unsqueeze(0)
        print(x.size())
        hs = self.init_state(200)
        x, hs = self.lstm(x, hs)
        x = x.reshape(-1, 200)
        x = self.linear(x)
        return x

Ok you might have to do

x = x.unsqueeze(0)
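
unsqueeze returns a new tensor rather than changing x in place, so the result has to be assigned back. A quick check:

import torch

x = torch.randn(10799, 53)
x.unsqueeze(0)          # returns a new tensor; x itself is unchanged
print(x.shape)          # torch.Size([10799, 53])
x = x.unsqueeze(0)      # assign the result back
print(x.shape)          # torch.Size([1, 10799, 53])
# (x.unsqueeze_(0) would be the in-place version)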

Yep, not sure how I missed that one. I now get the error RuntimeError: Expected hidden[0] size (3, 1, 200), got [3, 200, 200], and the size of x is [1, 10799, 53]. Does this error have something to do with:

def init_state(self, batch_size):
    return (torch.zeros(self.num_layers, batch_size, self.hidden_size),
            torch.zeros(self.num_layers, batch_size, self.hidden_size))

No, it is because your batch size is only 1, while the hidden state you created is for a batch size of 200. So when you define your hidden state, change the input parameter to 1.
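
Putting it together, something like this should make the shapes line up. This is just a sketch of the forward: using x.size(0) instead of a hard-coded 1 keeps it correct if you later batch the data, and since your LSTM is .double() the zeros in init_state may need a .double() as well.

    def forward(self, x):
        x = x.unsqueeze(0)                # [10799, 53] -> [1, 10799, 53]
        hs = self.init_state(x.size(0))   # state sized for the actual batch (here 1)
        x, hs = self.lstm(x, hs)
        x = x.reshape(-1, 200)
        x = self.linear(x)
        return x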