LSTM autoencoder implementation

I am implementing an LSTM autoencoder similar to the one in the paper by Srivastava et al., 'Unsupervised Learning of Video Representations using LSTMs'.

In the figure from that paper, the weights of the LSTM encoder are copied to those of the LSTM decoder.

To implement this, should the encoder weights be cloned into the decoder?

More specifically, is the snippet below correct?

import torch
import torch.nn as nn
from torch.autograd import Variable

hidden_size = 64  # not specified in the original post; any reasonable value

class Sequence(nn.Module):
    def __init__(self):
        super(Sequence, self).__init__()
        self.lstm_enc = nn.LSTMCell(1, hidden_size)
        self.fc_enc = nn.Linear(hidden_size, 1)
        self.lstm_dec = nn.LSTMCell(1, hidden_size)
        self.fc_dec = nn.Linear(hidden_size, 1)

    def forward(self, input, input_r):
        outputs = []

        h_t_enc = Variable(torch.zeros(input.size(0), hidden_size).cuda(), requires_grad=False)
        c_t_enc = Variable(torch.zeros(input.size(0), hidden_size).cuda(), requires_grad=False)

        # encoder: consume the input one timestep at a time
        for i, input_t in enumerate(input.chunk(input.size(1), dim=1)):
            h_t_enc, c_t_enc = self.lstm_enc(input_t, (h_t_enc, c_t_enc))
            output = self.fc_enc(c_t_enc)

        # decoder: initialise its state by cloning the encoder's final state
        h_t_dec = h_t_enc.clone()
        c_t_dec = c_t_enc.clone()

        outputs += [output]

        # note that input_r is the time-reversed version of input
        for i, input_t in enumerate(input_r.chunk(input_r.size(1), dim=1)):
            if i != input_r.size(1) - 1:
                h_t_dec, c_t_dec = self.lstm_dec(input_t, (h_t_dec, c_t_dec))
                output = self.fc_dec(c_t_dec)
                outputs += [output]

        outputs = torch.stack(outputs, 1).squeeze(2)

        return outputs

Do you mean the hidden activations?

Do you really have to clone them? It seems to me that you could directly pass your outputs to the decoder.

Yes, I mean the hidden activations.

Do you mean

h_t_dec = h_t_enc
c_t_dec = c_t_enc

?

Yes, or even directly:

h_t_dec, c_t_dec = self.lstm_dec(input_t, (h_t_enc, c_t_enc))
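
As a minimal sketch (reusing the lstm_dec and fc_dec modules from the snippet above), the decoder loop could then look like this, with no clone calls at all:

    # seed the decoder recurrence with the encoder's final states directly
    h_t_dec, c_t_dec = h_t_enc, c_t_enc
    for i, input_t in enumerate(input_r.chunk(input_r.size(1), dim=1)):
        if i != input_r.size(1) - 1:
            h_t_dec, c_t_dec = self.lstm_dec(input_t, (h_t_dec, c_t_dec))
            outputs += [self.fc_dec(c_t_dec)]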

Thanks for your reply.

But it seems to me that W1 at the encoder and W2 at the decoder are different.
If they are not, why do the authors use different notation?

I think that in the paper W1 and W2 represent the operations applied to the (hidden, cell, and input) activations, using the weight parameters of LSTM_enc and LSTM_dec respectively. These parameters are different.
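
As a quick sanity check (assuming the Sequence module defined above), the two LSTMCell modules own separate parameter tensors; only the activations (h_t, c_t) are handed from the encoder to the decoder:

    model = Sequence()
    # each cell has its own weights; these play the role of W1 and W2 in the paper
    print(model.lstm_enc.weight_ih is model.lstm_dec.weight_ih)   # False
    print(model.lstm_enc.weight_hh is model.lstm_dec.weight_hh)   # False
    # only the state tuple (h_t, c_t) is passed from lstm_enc to lstm_dec at run time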

Do you mean that I should use

h_t_dec = h_t_enc
c_t_dec = c_t_enc

even if these parameters are different?

Yes, this example could be interpreted as an autoencoder. W1, or in this example c_t, is passed through lstm1, and W2, or in this example c_t2, is passed through lstm2 across timesteps.

How you want to set this up, though, depends on what type of data you are looking to use the autoencoder model with.
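
For what it's worth, here is a rough usage sketch under the assumptions of the snippet above (2-D input of shape (batch, seq_len), a GPU since the states are created with .cuda(), and an illustrative batch size, sequence length, and learning rate that are not from the original post):

    model = Sequence().cuda()
    criterion = nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    x = torch.randn(32, 50).cuda()      # 32 sequences of 50 scalar values (illustrative)
    x_r = torch.flip(x, dims=[1])       # time-reversed copy fed to the decoder

    optimizer.zero_grad()
    out = model(x, x_r)                 # shape (32, 50): one encoder output + 49 decoder outputs
    loss = criterion(out, x_r)          # reconstruct the reversed sequence
    loss.backward()
    optimizer.step()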

@Seungyoung_Park Hey! Did you get a chance to finish the implementation of this? Do you mind sharing the source code? I'm also working on a similar problem. Thanks a lot.