Since I found very little help on this in the documentation or elsewhere, I am trying to figure out whether I did everything correctly. At first I only changed `bidirectional` to `True`, and to my surprise my model still ran without any error message. Now I have also changed the dimensions of the next linear layer and my forward method.
Here is my thought process:
Since a bidirectional LSTM doubles the output size, I think I need to double the input size of the next layer. Which is odd, because, as I said, my model still ran after changing `bidirectional` to `True` without changing the input size of the next layer.
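A minimal shape check (all dimensions here are made up) shows what actually doubles: the per-timestep `output` tensor concatenates both directions, so its last dimension is `2 * hidden_size`, while `hn` stacks the directions along dim 0 and keeps `hidden_size` as its last dimension. That is presumably why the old `nn.Linear(n_hidden, ...)` still accepted `hn[-1]` without an error.

```python
import torch
import torch.nn as nn

batch, seq_len, n_features, n_hidden, n_layers = 4, 10, 8, 16, 2

lstm = nn.LSTM(input_size=n_features, hidden_size=n_hidden,
               num_layers=n_layers, batch_first=True, bidirectional=True)

x = torch.randn(batch, seq_len, n_features)
output, (hn, cn) = lstm(x)

# output concatenates both directions per timestep: last dim is 2 * n_hidden.
print(output.shape)  # torch.Size([4, 10, 32])
# hn stacks directions along dim 0: (num_layers * 2, batch, n_hidden),
# so hn[-1] is still (batch, n_hidden) and fits a Linear(n_hidden, ...) layer.
print(hn.shape)      # torch.Size([4, 4, 16])
```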
I need to take the final hidden states of both LSTM directions, `hn[-2]` and `hn[-1]`, and feed both to the next layer. So I then thought that maybe my model did not raise an error because I had not implemented this yet.
It would just be nice to know for sure how to correctly change the LSTM structure. Any help is appreciated.
EDIT: I just realised that the PyTorch documentation states
My LSTM class now looks like this:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Model_LSTM(nn.Module):
    def __init__(self, n_features, n_classes, n_hidden, n_layers):
        # bidirectional possible
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=n_features,
            hidden_size=n_hidden,
            num_layers=n_layers,
            batch_first=True,
            dropout=0,
            bidirectional=True
        )
        self.dense1 = nn.Linear(n_hidden * 2, n_hidden)
        self.classifier = nn.Linear(n_hidden, n_classes)
        torch.nn.init.xavier_uniform_(self.lstm.weight_ih_l0)
        torch.nn.init.xavier_uniform_(self.lstm.weight_hh_l0)
        torch.nn.init.xavier_uniform_(self.classifier.weight)
        torch.nn.init.xavier_uniform_(self.dense1.weight)

    def forward(self, x):
        _, (hn, _) = self.lstm(x)
        # hn has shape (num_layers * 2, batch, n_hidden); the last two
        # entries are the final layer's forward and backward states.
        out = torch.cat([hn[-2], hn[-1]], dim=1)
        out = F.relu(self.dense1(out))
        return self.classifier(out)
```