I’m developing a BiLSTM model for sequence analysis in PyTorch, using torch.nn.LSTM. With that module you can stack several layers just by passing the num_layers parameter (e.g., num_layers=2). However, all of those layers will have the same hidden_size, which is only partially fine for me: I want all layers to share the same hidden_size except the last one, which should have a different size. A basic example follows:
import torch
import torch.nn as nn

rnn = nn.LSTM(input_size=10, hidden_size=20, num_layers=2)
inp = torch.randn(5, 3, 10)  # (seq_len, batch, input_size)
h0 = torch.randn(2, 3, 20)   # (num_layers, batch, hidden_size)
c0 = torch.randn(2, 3, 20)
output, (hn, cn) = rnn(inp, (h0, c0))
The output dim is (5, 3, 20).
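Since my actual model is bidirectional, here is the same example with bidirectional=True for completeness; the shape changes below are standard nn.LSTM behavior:

rnn_bi = nn.LSTM(input_size=10, hidden_size=20, num_layers=2, bidirectional=True)
h0 = torch.randn(4, 3, 20)  # (num_layers * num_directions, batch, hidden_size)
c0 = torch.randn(4, 3, 20)
output, (hn, cn) = rnn_bi(inp, (h0, c0))
print(output.shape)  # torch.Size([5, 3, 40]): hidden_size * num_directions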
One solution (though unfavorable for me) is to implement an extra model that takes the output of the first model as its input and produces the dimension I need, e.g.:
rnn_two = nn.LSTM(input_size=20, hidden_size=2)  # input_size must match the previous hidden_size
output2, _ = rnn_two(output)
This is essentially the same as this solution. However, I don’t want to do this because I parallelize the model using DataParallel, so I need everything to be one package. I was hoping to find something similar to Keras, e.g.:
rnn.add(LSTM, hidden_size=2)
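To make concrete what I mean by one package, here is a minimal sketch of the wrapper module I would otherwise have to write by hand (the class name StackedLSTM and its parameter names are placeholders of my own, not PyTorch API):

class StackedLSTM(nn.Module):
    def __init__(self, input_size=10, hidden_size=20, last_hidden_size=2, num_layers=2):
        super().__init__()
        # num_layers layers sharing the same hidden_size
        self.rnn = nn.LSTM(input_size=input_size, hidden_size=hidden_size,
                           num_layers=num_layers)
        # final layer with a different hidden size
        self.rnn_two = nn.LSTM(input_size=hidden_size, hidden_size=last_hidden_size)

    def forward(self, inp):
        output, _ = self.rnn(inp)
        output, _ = self.rnn_two(output)
        return output

model = StackedLSTM()
print(model(torch.randn(5, 3, 10)).shape)  # torch.Size([5, 3, 2])

DataParallel could then wrap this single module, but it still feels like a workaround rather than a built-in option.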
I have checked the LSTM source code but couldn’t find what I need.
Any suggestions?