I intend to implement an LSTM in PyTorch with multiple memory cell blocks per layer (or multiple LSTM units, where an LSTM unit is a memory block together with its gates), but it seems that the base class torch.nn.LSTM
only allows a multi-layer LSTM with one LSTM unit per layer:
lstm = torch.nn.LSTM(input_size, hidden_size, num_layers)
where (from the PyTorch documentation):
- input_size is the input dimension of the network,
- hidden_size is the hidden state dimension for every layer (i.e. the dimension of every layer),
- num_layers is the number of layers of the network.
From the above, each LSTM unit therefore has exactly one cell (so the cell state of each LSTM unit is a scalar), because for each layer the dimension of the cell state equals the dimension of the hidden state (i.e. hidden_size).
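To make the shapes concrete, here is a minimal check of what torch.nn.LSTM returns. The sizes (10, 20, 2, and the sequence length and batch size) are arbitrary values picked for illustration, not taken from any particular model:

```python
import torch

# Arbitrary illustrative sizes (assumptions, chosen only for this check).
input_size, hidden_size, num_layers = 10, 20, 2
seq_len, batch = 5, 3

lstm = torch.nn.LSTM(input_size, hidden_size, num_layers)

# With the default batch_first=False, the input layout is
# (seq_len, batch, input_size).
x = torch.randn(seq_len, batch, input_size)
output, (h_n, c_n) = lstm(x)

# Final hidden and cell states: one vector of length hidden_size
# per layer and per batch element.
print(h_n.shape)  # torch.Size([2, 3, 20])
print(c_n.shape)  # torch.Size([2, 3, 20])
```

The hidden state and the cell state come back with the same shape, (num_layers, batch, hidden_size), so as far as I can tell there is no separate parameter for the number of cells per unit.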
However, in the original LSTM model proposed by Hochreiter and Schmidhuber
[1997], each LSTM block/unit can contain several cells:
[Figure: LSTM network, from Hochreiter, 1997]
Is there a way to implement such multi-cell blocks in PyTorch?