I am having issues using an LSTM. I get the error RuntimeError: input must have 3 dimensions, got 2. I have looked online, but I am unable to get it to work. I am fairly new to PyTorch, so any help is appreciated. Here is my code:
class TeacherUpdated(nn.Module):
    def __init__(self):
        # Python requires calling the ancestor's initializer manually!
        super(TeacherUpdated, self).__init__()
        self.lstm = nn.LSTM(53, 200, 3, batch_first=True).double()
        self.linear = nn.Linear(200, 2)

    def forward(self, x):
        x, hs = self.lstm(x, 200)
        x = x.reshape(-1, 200)
        x = self.linear(x)
        return x
The LSTM expects a 3-dimensional input; with batch_first=True, that is (batch size, sequence length, input size). Can you print out the shape of your x? That is probably the problem.
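For example, a minimal sketch (made-up batch size and sequence length, just to show the expected shape with batch_first=True):

import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=53, hidden_size=200, num_layers=3, batch_first=True)
x = torch.randn(4, 10, 53)   # (batch_size=4, sequence_length=10, input_size=53)
out, (h, c) = lstm(x)
print(out.shape)             # torch.Size([4, 10, 200])
print(h.shape)               # torch.Size([3, 4, 200]) = (num_layers, batch_size, hidden_size)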
This is my shape: [10799, 53]
OK, what do those numbers represent? Is the first one supposed to be the sequence length or the batch size? Because that is the problem: the LSTM requires a 3-dimensional input of (batch size, sequence length, input size).
The x is the data being passed into the forward function. In self.lstm = nn.LSTM(53, 200, 3, batch_first=True).double(), the arguments are the input size, hidden size, and number of layers, respectively. There is something wrong with x, hs = self.lstm(x, 200), and I may not be calling it correctly.
What is the 200 for? To define a hidden and cell state you should do something like this:
def init_state(self, batch_size):
    return (torch.zeros(self.num_layers, batch_size, self.hidden_size),
            torch.zeros(self.num_layers, batch_size, self.hidden_size))
then do:
hs = self.init_state(batch_size)
x, hs = self.lstm(x, hs)
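As a rough, self-contained check with your sizes (3 layers, hidden size 200, input size 53) and a made-up batch of 4 sequences of length 10:

import torch
import torch.nn as nn

lstm = nn.LSTM(53, 200, 3, batch_first=True)
hs = (torch.zeros(3, 4, 200), torch.zeros(3, 4, 200))   # each state: (num_layers, batch_size, hidden_size)
out, hs = lstm(torch.randn(4, 10, 53), hs)              # input: (batch, seq_len, input_size)
print(out.shape)                                        # torch.Size([4, 10, 200])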
I tried this and I still get the same error. This is what I have:
def __init__(self):
    # Python requires calling the ancestor's initializer manually!
    super(TeacherUpdated, self).__init__()
    self.num_layers = 3
    self.hidden_size = 200
    self.lstm = nn.LSTM(53, self.hidden_size, self.num_layers, batch_first=True).double()
    self.linear = nn.Linear(200, 2)

def init_state(self, batch_size):
    return (torch.zeros(self.num_layers, batch_size, self.hidden_size),
            torch.zeros(self.num_layers, batch_size, self.hidden_size))

def forward(self, x):
    hs = self.init_state(200)
    x, hs = self.lstm(x, hs)
    x = x.reshape(-1, 200)
    x = self.linear(x)
    return x
Ok well the shape of your x input is still wrong. How do you pass the input into the model?
As stated before, the size of x is [10799, 53]. This is how I instantiate the model class: teacher_model = TeacherUpdated(). This is the start of the for loop that passes x into the model:
for step_count, (x, y_gt) in enumerate(train_loader):  # (x, y_gt)
    # Initialize gradients with 0
    optimizer.zero_grad()
    x = pd.DataFrame(data=x)
    x = torch.tensor(x.values)
    x = x.to(device)
    y_gt = y_gt.to(device)
    # Predict
    x = torch.flatten(x, start_dim=1, end_dim=-1)
    y_pred = teacher_model(x)
OK, can you print the shape of x before you flatten it? Also, what does the 10799 in the shape represent?
The 10799 is the number of samples in the training set. The shape of x before flattening is still [10799, 53].
Why are you not breaking it into batches? That is your problem: you need a batch dimension. You can either break the data into batches or do x.unsqueeze(0).
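A rough sketch of the two options (assuming each of your 10799 rows is an independent sample with 53 features; the batch size of 32 is just an example):

import torch

# Option 1: keep DataLoader batches and treat each row as a length-1 sequence
xb = torch.randn(32, 53)     # one batch from the loader: (batch_size, 53)
xb = xb.unsqueeze(1)         # -> (32, 1, 53) = (batch, seq_len, input_size)

# Option 2: treat the whole set as one long sequence with a batch of 1
xf = torch.randn(10799, 53)  # the full data: (num_samples, 53)
xf = xf.unsqueeze(0)         # -> (1, 10799, 53) = (batch, seq_len, input_size)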
I tried x.unsqueeze(0) in the forward before it gets passed into the LSTM, but that did not work. Doesn't this create a batch dimension?
Yes, it does. Your input only has two dimensions, though. What happened when you unsqueezed it?
Surprisingly, the size of x does not change when I unsqueeze it.
That is weird. Can you send the code where you unsqueeze it?
I just unsqueeze it in my forward function:
def forward(self, x):
    x.unsqueeze(0)
    print(x.size())
    hs = self.init_state(200)
    x, hs = self.lstm(x, hs)
    x = x.reshape(-1, 200)
    x = self.linear(x)
    return x
unsqueeze is not an in-place operation, so you need to assign the result back: x = x.unsqueeze(0).
Yep, not sure how I missed that one. I now get the error RuntimeError: Expected hidden[0] size (3, 1, 200), got [3, 200, 200], and the size of x is [1, 10799, 53]. Does this error have something to do with the following?
def init_state(self, batch_size):
    return (torch.zeros(self.num_layers, batch_size, self.hidden_size),
            torch.zeros(self.num_layers, batch_size, self.hidden_size))
The init_state function itself is fine. It is because your batch size is now 1, but you are passing 200, so the hidden state comes out as (3, 200, 200) instead of the expected (3, 1, 200). When you call init_state in the forward, change the argument from 200 to 1.
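Putting it together, the forward could look roughly like this (just a sketch; passing x.size(0) instead of hard-coding 1 keeps it correct if you later batch the data differently):

def forward(self, x):
    x = x.unsqueeze(0)               # (10799, 53) -> (1, 10799, 53) = (batch, seq_len, input_size)
    hs = self.init_state(x.size(0))  # states: (3, 1, 200) = (num_layers, batch_size, hidden_size)
    x, hs = self.lstm(x, hs)         # output: (1, 10799, 200)
    x = x.reshape(-1, 200)           # (10799, 200)
    x = self.linear(x)               # (10799, 2)
    return x

One more thing to watch for: since self.lstm was created with .double(), the input x and the tensors returned by init_state may also need to be double (e.g. torch.zeros(..., dtype=torch.double)), or you could drop the .double() call and work in float32 throughout.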