I am training an LSTM classifier on time-series data using mini-batches, e.g. batch size = 8. How does my model know where one sequence ends and the next one starts? I have read several times that I need to reset the hidden and cell state inside my model, but I am not sure whether that is correct, since I have also seen people not doing it. Either way, I tried it, and the problem is that when I initialize h0 and c0 in my model, the LSTM expects them to have shape (num_layers, batch_size, hidden_size); when I reach the end of my training dataset there is not enough data left for a full batch, so I get the error: Expected hidden[0] size (1, 14, 64), got [1, 16, 64].
So basically my questions are:
- How does my model know when one sequence ends and another one starts?
- How do I implement this?
My model without resetting the states looks like this:
class Model_LSTM(nn.Module):
    def __init__(self, n_features, n_classes, n_hidden, n_layers):  # bidirectional possible
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=n_features,
            hidden_size=n_hidden,
            num_layers=n_layers,
            batch_first=True,
            dropout=0.75
        )
        self.classifier = nn.Linear(n_hidden, n_classes)

    def forward(self, x):
        _, (hn, _) = self.lstm(x)
        out = hn[-1]
        return self.classifier(out)
With resetting the states (which does not work), it looks like this:
class Model_LSTM(nn.Module):
    def __init__(self, n_features, n_classes, n_hidden, n_layers):  # bidirectional possible
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=n_features,
            hidden_size=n_hidden,
            num_layers=n_layers,
            batch_first=True,
            dropout=0.75
        )
        self.classifier = nn.Linear(n_hidden, n_classes)

    def forward(self, x):
        h0 = torch.zeros(self.lstm.num_layers, batch_size, self.lstm.hidden_size)
        c0 = torch.zeros(self.lstm.num_layers, batch_size, self.lstm.hidden_size)
        _, (hn, cn) = self.lstm(x, (h0, c0))
        out = hn[-1]
        return self.classifier(out)
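For reference, here is a minimal sketch of what I understand the size-mismatch fix to be: deriving the initial-state batch size from the input tensor itself (x.size(0) with batch_first=True) instead of a fixed batch_size variable, so the smaller final batch of the DataLoader also passes. The hyperparameter values in the usage lines are made up for illustration, and dropout is omitted since it only matters with more than one layer:

```python
import torch
import torch.nn as nn

class Model_LSTM(nn.Module):
    def __init__(self, n_features, n_classes, n_hidden, n_layers):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=n_features,
            hidden_size=n_hidden,
            num_layers=n_layers,
            batch_first=True,
        )
        self.classifier = nn.Linear(n_hidden, n_classes)

    def forward(self, x):
        # Size the initial states from the batch actually received,
        # so a smaller last batch no longer raises a size error.
        batch_size = x.size(0)
        h0 = torch.zeros(self.lstm.num_layers, batch_size,
                         self.lstm.hidden_size, device=x.device)
        c0 = torch.zeros(self.lstm.num_layers, batch_size,
                         self.lstm.hidden_size, device=x.device)
        _, (hn, _) = self.lstm(x, (h0, c0))
        return self.classifier(hn[-1])

model = Model_LSTM(n_features=3, n_classes=2, n_hidden=64, n_layers=1)
full = model(torch.randn(16, 10, 3))   # a full batch of 16 sequences
last = model(torch.randn(14, 10, 3))   # a smaller final batch of 14
```

Both calls produce logits of shape (batch, n_classes), regardless of batch size.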