xzyx
January 8, 2018, 7:08am
#1
A very basic question: How could I implement my own LSTM by modifying the existing implementation? I read the source code but couldn’t find where the network structures are really implemented …

In the RNNBase module, it seems that these two lines in forward() actually run the computation?

func = self._backend.RNN(
    self.mode,
    self.input_size,
    self.hidden_size,
    num_layers=self.num_layers,
    batch_first=self.batch_first,
    dropout=self.dropout,
    train=self.training,
    bidirectional=self.bidirectional,
    batch_sizes=batch_sizes,
    dropout_state=self.dropout_state,
    flat_weight=flat_weight
)
output, hidden = func(input, self.all_weights, hx)

but then I don’t know what self._backend is… I tried to trace back through the code and got a bit lost.

3 Likes

alishir
(Ali Shirvani)
January 8, 2018, 7:11am
#2
Do you want to implement a custom LSTM cell or a custom LSTM network?

1 Like

xzyx
January 8, 2018, 7:28am
#3
I want to customize the LSTM cell.

Thanks!

alishir
(Ali Shirvani)
January 8, 2018, 7:44am
#4
Hmm, sorry, I don’t know how an LSTM cell could be customized.

jpeg729
(jpeg729)
January 8, 2018, 8:39am
#5
The LSTM class is implemented in C, so it is hard to find and harder to customise. The LSTMCell class is implemented in Python here, and the actual details of the calculation are implemented in Python here.

Those links are for PyTorch v0.3.0. I assume you know how to find the corresponding master branch should you need to.
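For readers without the linked files at hand, the cell calculation can be sketched with plain tensor ops. This is a minimal sketch, not the linked source: the function name `lstm_cell_step` is hypothetical, and the gate order (input, forget, cell, output) is assumed to match the weight layout that `nn.LSTMCell` documents.

```python
import torch
import torch.nn.functional as F


def lstm_cell_step(x, hidden, w_ih, w_hh, b_ih=None, b_hh=None):
    """One LSTM time step (hypothetical helper, for illustration only).

    x:      (batch, input_size)
    hidden: tuple (h, c), each of shape (batch, hidden_size)
    w_ih:   (4 * hidden_size, input_size)
    w_hh:   (4 * hidden_size, hidden_size)
    """
    h, c = hidden
    # Both linear maps produce all four gate pre-activations at once.
    gates = F.linear(x, w_ih, b_ih) + F.linear(h, w_hh, b_hh)
    # Assumed gate order: input, forget, cell candidate, output.
    i, f, g, o = gates.chunk(4, dim=1)
    i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
    g = torch.tanh(g)
    c_next = f * c + i * g            # new cell state
    h_next = o * torch.tanh(c_next)   # new hidden state
    return h_next, c_next


# Usage: compare against nn.LSTMCell by reusing its weights.
cell = torch.nn.LSTMCell(3, 5)
x = torch.randn(2, 3)
h0, c0 = torch.zeros(2, 5), torch.zeros(2, 5)
h1, c1 = lstm_cell_step(x, (h0, c0), cell.weight_ih, cell.weight_hh,
                        cell.bias_ih, cell.bias_hh)
h_ref, c_ref = cell(x, (h0, c0))
```

Customizing the cell then comes down to editing the few lines between `gates = ...` and `return`.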

1 Like

xzyx
January 8, 2018, 5:55pm
#6
Is the forward function of StackedRNN in the second link the Python version of a multi-layer LSTM/RNN?

I guess I’m mostly looking for a Python-only multi-layer RNN/LSTM as a reference so that I don’t have to start completely from scratch.

Thanks!

dgriff
January 8, 2018, 7:54pm
#7
You can use this:

```
import math

import torch as th
import torch.nn as nn
import torch.nn.functional as F
# V and T are the aliases the snippet relies on (assumed imports).
from torch import Tensor as T
from torch.autograd import Variable as V


class LSTM(nn.Module):
    """
    An implementation of Hochreiter & Schmidhuber:
    'Long-Short Term Memory'
    http://www.bioinf.jku.at/publications/older/2604.pdf
    Special args:
    dropout_method: one of
            * pytorch: default dropout implementation
            * gal: uses GalLSTM's dropout
            * moon: uses MoonLSTM's dropout
            * semeniuta: uses SemeniutaLSTM's dropout
    """

    def __init__(self, input_size, hidden_size, bias=True, dropout=0.0, dropout_method='pytorch'):
        super(LSTM, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.bias = bias
        self.dropout = dropout
        # One linear map each for input and hidden state; both emit all 4 gates.
        self.i2h = nn.Linear(input_size, 4 * hidden_size, bias=bias)
        self.h2h = nn.Linear(hidden_size, 4 * hidden_size, bias=bias)
        self.reset_parameters()
        assert dropout_method.lower() in ['pytorch', 'gal', 'moon', 'semeniuta']
        self.dropout_method = dropout_method

    def sample_mask(self):
        # Must be called before forward() when using 'gal' or 'moon' dropout.
        keep = 1.0 - self.dropout
        self.mask = V(th.bernoulli(T(1, self.hidden_size).fill_(keep)))

    def reset_parameters(self):
        std = 1.0 / math.sqrt(self.hidden_size)
        for w in self.parameters():
            w.data.uniform_(-std, std)

    def forward(self, x, hidden):
        # x: (1, batch, input_size); hidden: (h, c), each (1, batch, hidden_size)
        do_dropout = self.training and self.dropout > 0.0
        h, c = hidden
        h = h.view(h.size(1), -1)
        c = c.view(c.size(1), -1)
        x = x.view(x.size(1), -1)

        # Linear mappings
        preact = self.i2h(x) + self.h2h(h)

        # Activations: first 3 chunks are sigmoid gates, last chunk is the candidate
        gates = preact[:, :3 * self.hidden_size].sigmoid()
        g_t = preact[:, 3 * self.hidden_size:].tanh()
        i_t = gates[:, :self.hidden_size]
        f_t = gates[:, self.hidden_size:2 * self.hidden_size]
        o_t = gates[:, -self.hidden_size:]

        # Cell computations
        if do_dropout and self.dropout_method == 'semeniuta':
            g_t = F.dropout(g_t, p=self.dropout, training=self.training)

        c_t = th.mul(c, f_t) + th.mul(i_t, g_t)

        if do_dropout and self.dropout_method == 'moon':
            c_t.data.set_(th.mul(c_t, self.mask).data)
            c_t.data *= 1.0 / (1.0 - self.dropout)

        h_t = th.mul(o_t, c_t.tanh())

        if do_dropout:
            if self.dropout_method == 'pytorch':
                F.dropout(h_t, p=self.dropout, training=self.training, inplace=True)
            if self.dropout_method == 'gal':
                h_t.data.set_(th.mul(h_t, self.mask).data)
                h_t.data *= 1.0 / (1.0 - self.dropout)

        # Reshape for compatibility: back to (1, batch, hidden_size)
        h_t = h_t.view(1, h_t.size(0), -1)
        c_t = c_t.view(1, c_t.size(0), -1)
        return h_t, (h_t, c_t)
```
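A note on shapes, since the forward() above flattens its inputs with `view(x.size(1), -1)`: it appears to expect a single time step of shape (1, batch, features). The sketch below only illustrates what that `view` does; the sizes are made up, and feeding a whole sequence at once is just one possible way to end up with a size mismatch in `i2h`.

```python
import torch

batch, input_size, seq_len = 4, 10, 7

# One time step, laid out as (1, batch, features).
x_step = torch.randn(1, batch, input_size)
flat = x_step.view(x_step.size(1), -1)
print(flat.shape)    # torch.Size([4, 10])

# A whole sequence passed at once gets folded into the feature axis,
# so a Linear layer expecting input_size features would reject it.
x_seq = torch.randn(seq_len, batch, input_size)
folded = x_seq.view(x_seq.size(1), -1)
print(folded.shape)  # torch.Size([4, 70])

# Feeding one step at a time keeps the expected layout.
steps = [x_seq[t].unsqueeze(0) for t in range(seq_len)]
print(steps[0].shape)  # torch.Size([1, 4, 10])
```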

2 Likes

xzyx
January 8, 2018, 9:09pm
#8
I am confused … I guess you just have an LSTM cell here? Do you have a version of this wrapped into a multi-layer LSTM?

(Also I am not sure why you have batch-normalization?)

Thanks!

dgriff
January 8, 2018, 10:46pm
#9
I edited the post above to the PyTorch version found here: https://github.com/pytorch/benchmark/blob/master/benchmarks/lstm_variants/lstm.py

I could not find the LSTM layer version, but I believe it is specialized for efficiency and therefore hard to modify. It is usually best to customize the cell version for custom behavior. You can modify it into a layer version if you need one. I hope this is helpful; sorry I couldn’t find exactly what you are looking for.
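As a rough illustration of turning a cell into a multi-layer version, the sketch below loops cells over time and over layers. It is an assumption-laden example, not the benchmark repository's code: the class name `StackedLSTM` is hypothetical, and `nn.LSTMCell` is used as a stand-in so the block runs on its own. A custom cell could be swapped in after adapting its input and return shapes to this calling convention.

```python
import torch
import torch.nn as nn


class StackedLSTM(nn.Module):
    """Hypothetical multi-layer wrapper: one cell per layer, looped over time."""

    def __init__(self, input_size, hidden_size, num_layers=2):
        super().__init__()
        self.hidden_size = hidden_size
        sizes = [input_size] + [hidden_size] * (num_layers - 1)
        # nn.LSTMCell is a stand-in; any cell with the same call signature fits.
        self.cells = nn.ModuleList(nn.LSTMCell(s, hidden_size) for s in sizes)

    def forward(self, x, state=None):
        # x: (seq_len, batch, input_size)
        seq_len, batch, _ = x.shape
        if state is None:
            zeros = x.new_zeros(batch, self.hidden_size)
            state = [(zeros, zeros) for _ in self.cells]
        state = list(state)
        outputs = []
        for t in range(seq_len):
            inp = x[t]
            for layer, cell in enumerate(self.cells):
                h, c = cell(inp, state[layer])
                state[layer] = (h, c)
                inp = h  # this layer's output feeds the next layer
            outputs.append(inp)
        # (seq_len, batch, hidden_size) plus the final (h, c) of every layer
        return torch.stack(outputs), state


# Usage
model = StackedLSTM(input_size=10, hidden_size=16, num_layers=3)
out, final_state = model(torch.randn(7, 4, 10))
print(out.shape)  # torch.Size([7, 4, 16])
```

This will be slower than the built-in `nn.LSTM`, which is the price of keeping every step editable in Python.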

1 Like

Hi, I tried to use your LSTM template to code a custom cell, but I’m getting tensor size mismatch errors. Here is my code: Size mismatch error when using custom LSTM cell. Did you encounter anything like this? Any suggestions?

1 Like

I could not access the link. Can you provide it?
Thanks

Have you implemented a modified LSTMCell?

STU
(CODE)
August 13, 2020, 1:24pm
#13
Hey, I read some blogs but I still can’t figure out how to modify the LSTMCell, especially when trying to use a bidirectional LSTM. Do you know how to deal with that?
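One way to get a bidirectional layer out of cells is to run one cell over the sequence in order, a second cell over the reversed sequence, and concatenate the two hidden states per time step. The sketch below assumes equal-length sequences (no padding or packing); the class name `BiLSTMFromCells` is hypothetical and `nn.LSTMCell` stands in for a modified cell.

```python
import torch
import torch.nn as nn


class BiLSTMFromCells(nn.Module):
    """Hypothetical bidirectional layer built from two independent cells."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.hidden_size = hidden_size
        # Stand-ins: replace both with a custom cell of the same signature.
        self.fwd_cell = nn.LSTMCell(input_size, hidden_size)
        self.bwd_cell = nn.LSTMCell(input_size, hidden_size)

    def _run(self, cell, steps):
        # Run one cell over a list of (batch, input_size) tensors from zero state.
        batch = steps[0].size(0)
        h = steps[0].new_zeros(batch, self.hidden_size)
        c = steps[0].new_zeros(batch, self.hidden_size)
        outs = []
        for x_t in steps:
            h, c = cell(x_t, (h, c))
            outs.append(h)
        return outs

    def forward(self, x):
        # x: (seq_len, batch, input_size)
        steps = list(x.unbind(0))
        fwd = self._run(self.fwd_cell, steps)
        # Process the reversed sequence, then flip back to align time steps.
        bwd = self._run(self.bwd_cell, steps[::-1])[::-1]
        # (seq_len, batch, 2 * hidden_size)
        return torch.stack([torch.cat(pair, dim=1) for pair in zip(fwd, bwd)])


# Usage
layer = BiLSTMFromCells(input_size=10, hidden_size=16)
x = torch.randn(7, 4, 10)
y = layer(x)
print(y.shape)  # torch.Size([7, 4, 32])
```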