So, there is one set of weights for both layers (LSTM instances) as a whole, but each layer (LSTM instance) has access only to its own corresponding part of it during training or inference, right?
In your code there is only one LSTM instance (`self.lstm`), and since you use it twice, it uses exactly the same weights both times.
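A minimal sketch of what "reusing one instance" means (sizes are hypothetical, not from the original code): a single `nn.LSTM` called twice in `forward`. Note that for the same module to be applied to its own output, `hidden_size` must equal `input_size`. Listing the model's parameters shows there is only one set of LSTM weights, no matter how many times `forward` calls it.

```python
import torch
import torch.nn as nn

class Reuser(nn.Module):
    """Sketch: one LSTM instance applied twice in forward."""
    def __init__(self):
        super().__init__()
        # hidden_size == input_size so the output can be fed back in
        self.lstm = nn.LSTM(input_size=5, hidden_size=5)

    def forward(self, x):
        out, _ = self.lstm(x)    # first pass
        out, _ = self.lstm(out)  # second pass: exactly the same weights
        return out

model = Reuser()
# Only one set of LSTM parameters exists, however often forward uses it.
names = [n for n, _ in model.named_parameters()]
print(names)
# ['lstm.weight_ih_l0', 'lstm.weight_hh_l0', 'lstm.bias_ih_l0', 'lstm.bias_hh_l0']
```

Gradients from both calls accumulate into that single parameter set during backpropagation.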
I got one more question.
I added a couple of lines to the code (it runs without errors) and got modules like this:
LSTMTagger(
(word_embeddings): Embedding(9, 5)
(lstm): LSTM(5, 9)
(lstm2): LSTM(9, 15)
(hidden2tag): Linear(in_features=15, out_features=3)
)
Embedding(9, 5)
LSTM(5, 9)
LSTM(9, 15)
Linear(in_features=15, out_features=3)
But before, I got:
LSTMTagger(
(word_embeddings): Embedding(9, 6)
(lstm): LSTM(6, 9)
(lstm2): LSTM(6, 9)
(hidden2tag): Linear(in_features=9, out_features=3)
)
Embedding(9, 6)
LSTM(6, 9)
LSTM(6, 9)
Linear(in_features=9, out_features=3)
So, as you can see, here one LSTM module whose output size is 9 is fed into another whose input size is 6, and it still works.
I can't understand why this example throws no errors and works well despite the mismatched input and output dimensions.
It looks a bit fishy to me.
Perhaps it takes only the first 6 values as input to the second LSTM, but I'm not sure.
- The second layer should have `input_size` equal to the `hidden_size` of the first layer.
- The printed description of the model corresponds to the order in which you declare the layers in `__init__`, but `forward` can use the layers in a different order. That is why the printed description can sometimes look incoherent even when the model still works.
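To make the chaining rule concrete, here is a sketch of a correctly wired two-LSTM tagger. The layer sizes are taken from the first printout above; the `forward` wiring is an assumption about the intended data flow. Note that PyTorch checks shapes only when `forward` actually runs a tensor through a layer, so a model with mismatched declared sizes still constructs and prints fine as long as `forward` never feeds one layer's output into the incompatible one.

```python
import torch
import torch.nn as nn

class TwoLayerTagger(nn.Module):
    """Sketch: two chained LSTMs with matching sizes
    (Embedding(9, 5) -> LSTM(5, 9) -> LSTM(9, 15) -> Linear(15, 3))."""
    def __init__(self):
        super().__init__()
        self.word_embeddings = nn.Embedding(9, 5)
        self.lstm = nn.LSTM(5, 9)    # input_size 5 = embedding dim
        self.lstm2 = nn.LSTM(9, 15)  # input_size 9 = hidden_size of self.lstm
        self.hidden2tag = nn.Linear(15, 3)

    def forward(self, sentence):
        emb = self.word_embeddings(sentence).view(len(sentence), 1, -1)
        out, _ = self.lstm(emb)   # (seq_len, 1, 9)
        out, _ = self.lstm2(out)  # (seq_len, 1, 15)
        return self.hidden2tag(out.view(len(sentence), -1))  # (seq_len, 3)

tagger = TwoLayerTagger()
scores = tagger(torch.tensor([0, 1, 2, 3]))
print(scores.shape)  # torch.Size([4, 3])
```

If the sizes did not match (say, feeding the 9-dimensional output into an LSTM declared with `input_size=6`), the error would only surface as a `RuntimeError` during this forward call, never at construction or when printing the model.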