Reducing LSTM Hidden State Output to 1 Dimension

Hi, this is the example given in the documentation (I modified the numbers to show a simple example of a 1-in, 1-out LSTM):

>>> import torch
>>> import torch.nn as nn
>>> from torch.autograd import Variable
>>> rnn = nn.LSTM(1, 100, 4)                # input_size=1, hidden_size=100, num_layers=4
>>> input = Variable(torch.randn(1, 1, 1))  # (seq_len, batch, input_size)
>>> h0 = Variable(torch.randn(4, 1, 100))   # (num_layers, batch, hidden_size)
>>> c0 = Variable(torch.randn(4, 1, 100))
>>> output, (hn, cn) = rnn(input, (h0, c0))

This gives an output dimension of [1, 1, 100].

How do I reduce it to size [1, 1, 1] (a basic 1-in, 1-out LSTM)? I tried adding a linear layer, but it didn't work (the loss wasn't decreasing properly). At the moment I can only make a simple LSTM work with a hidden size of 1.

Any ideas?

I'm not sure what you mean by a 1-in, 1-out LSTM: do you want 1 input feature and 1 output feature while using a hidden size of 100? The standard practice is to add a Linear layer on top of the LSTM, as in the sketch below. If the network learns with a hidden size of 1 but not with a hidden size of 100, the extra capacity may just be making the optimization harder for the simple task you're training it on. It's hard to help much beyond that; it depends on a lot of factors.
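A minimal sketch of that setup, assuming sequence-first tensors of shape (seq_len, batch, features); the class name LSTMRegressor and the default sizes are just illustrative, and it's written without Variable, which newer PyTorch versions no longer need:

import torch
import torch.nn as nn

class LSTMRegressor(nn.Module):
    # Hypothetical wrapper: an LSTM followed by a Linear layer that projects
    # the 100-dim hidden output down to 1 output feature per time step.
    def __init__(self, input_size=1, hidden_size=100, num_layers=4, output_size=1):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x, state=None):
        out, state = self.lstm(x, state)   # out: (seq_len, batch, hidden_size)
        return self.fc(out), state         # projected to (seq_len, batch, output_size)

model = LSTMRegressor()
x = torch.randn(1, 1, 1)                   # (seq_len=1, batch=1, input_size=1)
y, _ = model(x)
print(y.size())                            # torch.Size([1, 1, 1])

The Linear layer is applied to the last dimension of the LSTM output, so it turns the [1, 1, 100] output into the [1, 1, 1] you're after while keeping the 100-unit hidden state inside the model.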
