A very basic question: how could I implement my own LSTM by modifying the existing implementation? I read the source code but couldn’t find where the network structure is actually implemented…
In the RNNBase module, it seems that these two lines in forward() actually perform the computation:
func = self._backend.RNN(
output, hidden = func(input, self.all_weights, hx)
but then I don’t know what self._backend is… I tried to trace back through the code and got a bit lost.
Do you want to implement a custom LSTM cell or a custom LSTM network?
I want to customize the LSTM cell.
Hmm, sorry, I don’t know how an LSTM cell could be customized.
The LSTM class is implemented in C, so it is hard to find and harder to customise. The LSTMCell class is implemented in Python here, and the actual details of the calculation are implemented in Python here.
Those links are for PyTorch v0.3.0. I assume you know how to find the corresponding master branch should you need to.
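For reference, here is a minimal sketch of what a single LSTM cell step computes, written as a plain Python function. It follows the equations in the nn.LSTMCell documentation and assumes the gate order (input, forget, cell, output) used by nn.LSTMCell's weights. The name lstm_cell_step is made up for this example and is not a PyTorch function.

```python
import torch
import torch.nn.functional as F


def lstm_cell_step(x, hidden, w_ih, w_hh, b_ih=None, b_hh=None):
    """One LSTM step. x: (batch, input_size); hidden: (h, c), each (batch, hidden_size)."""
    h, c = hidden
    # One matrix multiply per input produces all four gate pre-activations at once.
    gates = F.linear(x, w_ih, b_ih) + F.linear(h, w_hh, b_hh)
    # Assumed gate order: input, forget, cell candidate, output.
    i, f, g, o = gates.chunk(4, dim=1)
    i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
    g = torch.tanh(g)
    c_next = f * c + i * g           # new cell state
    h_next = o * torch.tanh(c_next)  # new hidden state
    return h_next, c_next
```

Customizing the cell then mostly means changing the lines that combine i, f, g and o. Passing in the weights of an existing nn.LSTMCell is a quick way to check that an unmodified copy still matches the built-in result.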
Is it true that the forward function of StackedRNN in the second link is the Python version of a multi-layer LSTM/RNN?
I guess I’m mostly looking for a Python-only multi-layer RNN/LSTM as a reference so that I don’t have to start completely from scratch.
You can use this:
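    import math

    import torch as th
    import torch.nn as nn
    import torch.nn.functional as F


    # Class header assumed; the pasted code lost this line.
    class LSTM(nn.Module):
        """
        An implementation of Hochreiter & Schmidhuber:
        'Long-Short Term Memory'

        dropout_method: one of
            * pytorch: default dropout implementation
            * gal: uses GalLSTM's dropout
            * moon: uses MoonLSTM's dropout
            * semeniuta: uses SemeniutaLSTM's dropout
        """

        def __init__(self, input_size, hidden_size, bias=True, dropout=0.0, dropout_method='pytorch'):
            super().__init__()
            self.input_size = input_size
            self.hidden_size = hidden_size
            self.bias = bias
            self.dropout = dropout
            self.i2h = nn.Linear(input_size, 4 * hidden_size, bias=bias)
            self.h2h = nn.Linear(hidden_size, 4 * hidden_size, bias=bias)
            self.reset_parameters()
            assert dropout_method.lower() in ['pytorch', 'gal', 'moon', 'semeniuta']
            self.dropout_method = dropout_method

        # The method names sample_mask and reset_parameters are assumed;
        # the pasted code lost both `def` lines.
        def sample_mask(self):
            # Sample one Bernoulli keep-mask, shared across the batch (for 'gal' and 'moon').
            # th.full replaces the post's V(...)/T(...) aliases, whose imports were not shown.
            keep = 1.0 - self.dropout
            self.mask = th.bernoulli(th.full((1, self.hidden_size), keep))

        def reset_parameters(self):
            std = 1.0 / math.sqrt(self.hidden_size)
            for w in self.parameters():
                # Assumed loop body (missing from the post): uniform init in [-std, std].
                w.data.uniform_(-std, std)

        def forward(self, x, hidden):
            do_dropout = self.training and self.dropout > 0.0
            h, c = hidden
            # Inputs arrive as (1, batch, features); flatten to (batch, features).
            h = h.view(h.size(1), -1)
            c = c.view(c.size(1), -1)
            x = x.view(x.size(1), -1)

            # Linear mappings
            preact = self.i2h(x) + self.h2h(h)

            # First 3 * hidden_size columns are the i, f, o gates; the rest is the candidate g.
            gates = preact[:, :3 * self.hidden_size].sigmoid()
            g_t = preact[:, 3 * self.hidden_size:].tanh()
            i_t = gates[:, :self.hidden_size]
            f_t = gates[:, self.hidden_size:2 * self.hidden_size]
            o_t = gates[:, -self.hidden_size:]

            # cell computations
            if do_dropout and self.dropout_method == 'semeniuta':
                g_t = F.dropout(g_t, p=self.dropout, training=self.training)

            c_t = th.mul(c, f_t) + th.mul(i_t, g_t)

            if do_dropout and self.dropout_method == 'moon':
                # Assumed step (missing from the post): apply the mask from sample_mask().
                c_t.data.mul_(self.mask)
                c_t.data *= 1.0 / (1.0 - self.dropout)

            h_t = th.mul(o_t, c_t.tanh())

            # Guard added so the 'gal' rescaling does not run in eval mode.
            if do_dropout:
                if self.dropout_method == 'pytorch':
                    F.dropout(h_t, p=self.dropout, training=self.training, inplace=True)
                if self.dropout_method == 'gal':
                    # Assumed step (missing from the post): apply the mask from sample_mask().
                    h_t.data.mul_(self.mask)
                    h_t.data *= 1.0 / (1.0 - self.dropout)

            # Reshape for compatibility
            h_t = h_t.view(1, h_t.size(0), -1)
            c_t = c_t.view(1, c_t.size(0), -1)
            return h_t, (h_t, c_t)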
I am confused… I guess you just have an LSTM cell here? Do you have a version of this wrapped into a multi-layer LSTM?
(Also I am not sure why you have batch-normalization?)
I edited the code above to the PyTorch version found here: https://github.com/pytorch/benchmark/blob/master/benchmarks/lstm_variants/lstm.py
I could not find a layer version of the LSTM, but I believe it is specialized for efficiency and therefore hard to modify. It is usually best to customize the cell version for custom work; you can then wrap it into a layer version if you need one. Hope this is helpful, and sorry I couldn’t find exactly what you are looking for.
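One way to turn a cell into a multi-layer layer is to loop over time steps and layers by hand. The sketch below is an illustration only, not code from PyTorch or the benchmark repo: StackedLSTM and the cell_cls argument are hypothetical names, and it assumes the cell follows the nn.LSTMCell call convention, cell(x, (h, c)) -> (h, c) with 2-D tensors. The cell posted above uses 3-D tensors and returns h_t, (h_t, c_t), so it would need a small adapter first.

```python
import torch
import torch.nn as nn


class StackedLSTM(nn.Module):
    """Hypothetical multi-layer LSTM built from per-layer cells."""

    def __init__(self, input_size, hidden_size, num_layers, cell_cls=nn.LSTMCell):
        super().__init__()
        self.hidden_size = hidden_size
        # Layer 0 consumes the input; every later layer consumes the hidden state below it.
        self.cells = nn.ModuleList(
            [cell_cls(input_size if layer == 0 else hidden_size, hidden_size)
             for layer in range(num_layers)]
        )

    def forward(self, x, state=None):
        # x: (seq_len, batch, input_size)
        seq_len, batch, _ = x.shape
        if state is None:
            zeros = x.new_zeros(batch, self.hidden_size)
            state = [(zeros, zeros) for _ in self.cells]
        else:
            state = list(state)
        outputs = []
        for t in range(seq_len):
            inp = x[t]
            for layer, cell in enumerate(self.cells):
                h, c = cell(inp, state[layer])
                state[layer] = (h, c)
                inp = h  # feed this layer's hidden state to the next layer
            outputs.append(inp)
        # Outputs of the top layer: (seq_len, batch, hidden_size); state: one (h, c) per layer.
        return torch.stack(outputs), state
```

This will be much slower than nn.LSTM because nothing is fused, but every step is plain Python and easy to change. Dropout between layers is left out to keep the sketch short.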
Hi, I tried to use your LSTM template to write a custom cell, but I’m getting tensor size mismatch errors. Here is my code: Size mismatch error when using custom LSTM cell. Did you encounter anything like this? Any suggestions?
I could not access the link. Can you provide it?
Have you implemented a modified LSTMCell?
Hey, I read some blogs but I still can’t figure out how to modify the LSTMCell, especially when trying to use a bidirectional LSTM. Do you know how to deal with that?
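A possible approach for the bidirectional case is to keep two independent cells, run one over the sequence front to back and the other back to front, and concatenate their hidden states at each time step. The sketch below assumes all sequences in the batch have the same length (no padding or packing) and that the cell follows the nn.LSTMCell call convention. BiLSTMFromCells and cell_cls are hypothetical names for this example.

```python
import torch
import torch.nn as nn


class BiLSTMFromCells(nn.Module):
    """Hypothetical single-layer bidirectional LSTM built from two cells."""

    def __init__(self, input_size, hidden_size, cell_cls=nn.LSTMCell):
        super().__init__()
        self.hidden_size = hidden_size
        self.fwd_cell = cell_cls(input_size, hidden_size)
        self.bwd_cell = cell_cls(input_size, hidden_size)

    def _run(self, cell, x, reverse):
        seq_len, batch, _ = x.shape
        h = x.new_zeros(batch, self.hidden_size)
        c = x.new_zeros(batch, self.hidden_size)
        steps = range(seq_len - 1, -1, -1) if reverse else range(seq_len)
        outputs = [None] * seq_len
        for t in steps:
            h, c = cell(x[t], (h, c))
            outputs[t] = h  # store by time index so both directions stay aligned
        return torch.stack(outputs)

    def forward(self, x):
        # x: (seq_len, batch, input_size)
        fwd = self._run(self.fwd_cell, x, reverse=False)
        bwd = self._run(self.bwd_cell, x, reverse=True)
        # Output: (seq_len, batch, 2 * hidden_size)
        return torch.cat([fwd, bwd], dim=-1)
```

Swapping a modified cell class in through cell_cls changes both directions at once, so the bidirectional wiring itself does not need to be touched.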