Adding a new hidden layer to an LSTM

So, there is one set of weights for both layers (LSTM instances) as a whole, but each layer (LSTM instance) has access only to its own corresponding part of it while training or inferring, right?

No. In your code there is only one LSTM instance (self.lstm), and since you call it twice, it uses exactly the same weights both times.
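Here is a minimal sketch (not the code from the post above) illustrating the difference: reusing one nn.LSTM instance twice means both passes share one set of parameters, while two separate instances each get their own.

```python
import torch
import torch.nn as nn

# One LSTM instance applied twice: both passes use the same weight tensors.
shared = nn.LSTM(input_size=5, hidden_size=5)

# Two separate instances: each has its own, independent weight tensors.
first = nn.LSTM(input_size=5, hidden_size=5)
second = nn.LSTM(input_size=5, hidden_size=5)

x = torch.randn(7, 1, 5)  # (seq_len, batch, features)

out, _ = shared(x)
out, _ = shared(out)  # second call reuses exactly the same parameters

# One set of weights vs. two independent sets (twice as many parameters).
print(sum(p.numel() for p in shared.parameters()))
print(sum(p.numel() for p in list(first.parameters()) + list(second.parameters())))
```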


I have one more question.
I added a couple of lines to the code (it runs without errors) and got modules like this:

LSTMTagger(
  (word_embeddings): Embedding(9, 5)
  (lstm): LSTM(5, 9)
  (lstm2): LSTM(9, 15)
  (hidden2tag): Linear(in_features=15, out_features=3)
)
Embedding(9, 5)
LSTM(5, 9)
LSTM(9, 15)
Linear(in_features=15, out_features=3)

But before, I got this:

LSTMTagger(
  (word_embeddings): Embedding(9, 6)
  (lstm): LSTM(6, 9)
  (lstm2): LSTM(6, 9)
  (hidden2tag): Linear(in_features=9, out_features=3)
)
Embedding(9, 6)
LSTM(6, 9)
LSTM(6, 9)
Linear(in_features=9, out_features=3)

So as you can see, here the first LSTM module outputs 9 features, which are fed to another LSTM whose input size is 6, and it still works.
I can't understand why this example throws no errors and works fine despite the mismatched input and output dimensions.
It looks a bit fishy to me.
Perhaps it takes only the first 6 values as input to the second LSTM, but I'm not sure.

  1. The second layer should have input_size equal to the hidden_size of the first layer, as in the sketch below.
  2. The printed description of the model corresponds to the order in which you declare the layers in __init__, but forward can use the layers in a different order. That is why the printed description can sometimes look incoherent even though the model still works.
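For reference, here is a minimal sketch of a two-layer tagger wired as in point 1. The dimensions follow the first printout (Embedding(9, 5), LSTM(5, 9), LSTM(9, 15), Linear(15, 3)), but the forward() shown here is my assumption, not the original poster's code.

```python
import torch
import torch.nn as nn

class LSTMTagger(nn.Module):
    def __init__(self, vocab_size=9, embedding_dim=5, hidden1=9, hidden2=15, tagset_size=3):
        super().__init__()
        self.word_embeddings = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden1)   # LSTM(5, 9)
        self.lstm2 = nn.LSTM(hidden1, hidden2)        # LSTM(9, 15): input = hidden size of previous layer
        self.hidden2tag = nn.Linear(hidden2, tagset_size)

    def forward(self, sentence):
        embeds = self.word_embeddings(sentence)                 # (seq_len, embedding_dim)
        out, _ = self.lstm(embeds.view(len(sentence), 1, -1))   # (seq_len, 1, 9)
        out, _ = self.lstm2(out)                                 # dimensions match: 9 in, 15 out
        tag_space = self.hidden2tag(out.view(len(sentence), -1))
        return torch.log_softmax(tag_space, dim=1)

model = LSTMTagger()
print(model)  # printed order matches __init__, not necessarily forward()
tags = model(torch.tensor([0, 1, 2, 3]))
print(tags.shape)  # torch.Size([4, 3])
```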