I am training an LSTM classifier on time-series data using mini-batches, e.g. batch size = 8. How does my model know where one sequence ends and the next one starts? I have read several times that I need to reset the hidden and cell state inside my model, but I am not sure whether that is correct, since I have also seen people not doing it. Either way, I tried it, and the problem is that when I initialize h0 and c0 in my model, the LSTM expects them to have shape (num_layers, batch_size, hidden_size); when I reach the end of my training dataset there is not enough data left for a full batch, so I get the error: Expected hidden[0] size (1, 14, 64), got [1, 16, 64].
So basically my questions are:
- How does my model know when one sequence ends and another one starts?
- How do I implement this?
My model without resetting the states looks like this:
class Model_LSTM(nn.Module):
    def __init__(self, n_features, n_classes, n_hidden, n_layers):  # bidirectional possible
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=n_features,
            hidden_size=n_hidden,
            num_layers=n_layers,
            batch_first=True,
            dropout=0.75
        )
        self.classifier = nn.Linear(n_hidden, n_classes)

    def forward(self, x):
        _, (hn, _) = self.lstm(x)
        out = hn[-1]
        return self.classifier(out)
With resetting the states (which does not work), it looks like this:
class Model_LSTM(nn.Module):
    def __init__(self, n_features, n_classes, n_hidden, n_layers):  # bidirectional possible
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=n_features,
            hidden_size=n_hidden,
            num_layers=n_layers,
            batch_first=True,
            dropout=0.75
        )
        self.classifier = nn.Linear(n_hidden, n_classes)

    def forward(self, x):
        h0 = torch.zeros(self.lstm.num_layers, batch_size, self.lstm.hidden_size)
        c0 = torch.zeros(self.lstm.num_layers, batch_size, self.lstm.hidden_size)
        _, (hn, cn) = self.lstm(x, (h0, c0))
        out = hn[-1]
        return self.classifier(out)
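For reference, here is a minimal sketch of what I understand the size-mismatch fix to be: deriving the initial-state batch size from the input tensor itself (x.size(0) with batch_first=True) instead of a fixed batch_size variable, so the smaller final batch of the DataLoader also passes. The hyperparameter values in the usage lines are made up for illustration, and dropout is omitted since it only matters with more than one layer:

```python
import torch
import torch.nn as nn

class Model_LSTM(nn.Module):
    def __init__(self, n_features, n_classes, n_hidden, n_layers):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=n_features,
            hidden_size=n_hidden,
            num_layers=n_layers,
            batch_first=True,
        )
        self.classifier = nn.Linear(n_hidden, n_classes)

    def forward(self, x):
        # Size the initial states from the batch actually received,
        # so a smaller last batch no longer raises a size error.
        batch_size = x.size(0)
        h0 = torch.zeros(self.lstm.num_layers, batch_size,
                         self.lstm.hidden_size, device=x.device)
        c0 = torch.zeros(self.lstm.num_layers, batch_size,
                         self.lstm.hidden_size, device=x.device)
        _, (hn, _) = self.lstm(x, (h0, c0))
        return self.classifier(hn[-1])

model = Model_LSTM(n_features=3, n_classes=2, n_hidden=64, n_layers=1)
full = model(torch.randn(16, 10, 3))   # a full batch of 16 sequences
last = model(torch.randn(14, 10, 3))   # a smaller final batch of 14
```

Both calls produce logits of shape (batch, n_classes), regardless of batch size.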