I have time series data that I want to classify. My training data contains 3 different classes, each with about 50 examples. Each example is a time series consisting of 190 timesteps, which corresponds to about 1.9 seconds.
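To make the shapes concrete, this is roughly how I arrange the data before feeding it to the model (the feature count of 6 is just a placeholder for this sketch, not my real number of channels):

```python
import torch

# Placeholder shapes: 3 classes x ~50 examples = ~150 sequences of 190 timesteps;
# the feature count (6) is made up for this sketch.
n_examples, n_timesteps, n_features = 150, 190, 6

X = torch.randn(n_examples, n_timesteps, n_features)  # (batch, seq_len, features) for batch_first=True
y = torch.randint(0, 3, (n_examples,))                # class labels 0, 1, 2
```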
In the beginning I trained with batch size = 1. Now that my model is working and training, I want to try different batch sizes. My professor told me I need to reset my hidden state and cell state when training with batch size > 1. After looking around on the internet I came up with the code below, but I am wondering if it is right: constantly setting my hidden state to 0 inside my forward function makes me think the model should not be able to learn.
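For reference, this is what I understood "resetting" to mean from what I read. As far as I can tell, nn.LSTM already starts from zero states when you don't pass (h0, c0), so the two variants below should be equivalent (the sizes are made up for the sketch):

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=6, hidden_size=32, num_layers=2, batch_first=True)
x = torch.randn(8, 190, 6)  # (batch, seq_len, features)

# Variant 1: explicitly pass fresh zero states for every batch
h0 = torch.zeros(2, x.size(0), 32)  # (num_layers, batch, hidden_size)
c0 = torch.zeros(2, x.size(0), 32)
out, (hn, cn) = lstm(x, (h0, c0))

# Variant 2: pass no states at all; PyTorch then uses zero states anyway
out, (hn, cn) = lstm(x)
```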
And here is my actual class:
```python
import torch
import torch.nn as nn


class Model_LSTM(nn.Module):
    def __init__(self, n_features, n_classes, n_hidden, n_layers):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=n_features,
            hidden_size=n_hidden,
            num_layers=n_layers,
            batch_first=True,
            dropout=0.75,
            bidirectional=True,
        )
        self.linear = nn.Linear(n_hidden, n_hidden)  # not used in forward()

        # He initialization
        weight = torch.zeros(n_layers, n_hidden)
        nn.init.kaiming_uniform_(weight)
        self.weight = nn.Parameter(weight)

        self.classifier = nn.Linear(n_hidden, n_classes)

    def init_hidden(self, batch_size):
        # fresh zero states for a new batch
        hidden_state = torch.zeros(self.lstm.num_layers, batch_size, self.lstm.hidden_size)
        cell_state = torch.zeros(self.lstm.num_layers, batch_size, self.lstm.hidden_size)
        return (hidden_state, cell_state)

    def forward(self, x):
        # reset the states on every forward call; note they are never passed to self.lstm
        self.hidden = self.init_hidden(x.size(0))
        _, (hidden, _) = self.lstm(x)
        out = hidden[-1]
        return self.classifier(out)
```
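And this is roughly how I plan to train it with batch size > 1 (the data, hyperparameters, and epoch count here are made up for the sketch; it uses the Model_LSTM class above):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Made-up stand-in data with the shapes described above
X = torch.randn(150, 190, 6)
y = torch.randint(0, 3, (150,))

model = Model_LSTM(n_features=6, n_classes=3, n_hidden=64, n_layers=2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
loader = DataLoader(TensorDataset(X, y), batch_size=16, shuffle=True)

for epoch in range(10):
    for xb, yb in loader:
        optimizer.zero_grad()
        logits = model(xb)           # every batch starts from zero hidden/cell states
        loss = criterion(logits, yb)
        loss.backward()
        optimizer.step()
```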