LSTM Autoencoders in PyTorch

Hello everyone. I’m trying to implement an LSTM autoencoder using PyTorch. I have a dataset consisting of around 200000 data instances and 120 features. I load my data from a CSV file using NumPy and then convert it to sequence format using the following function:

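import torch

def sequencer_fw(nparray, seq_len):
  # Split the (n_rows, n_features) array into consecutive, non-overlapping windows.
  sequences = nparray.tolist()
  dataset = []
  # Stop before a trailing partial window so that every sequence has seq_len rows
  # (torch.stack fails on windows of unequal length).
  for i in range(0, len(sequences) - seq_len + 1, seq_len):
    dataset.append(torch.tensor(sequences[i:i + seq_len]))
  stacked_shape = torch.stack(dataset).shape
  print(stacked_shape)
  n_seq, seq_len, n_features = stacked_shape
  return dataset, seq_len, n_features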

Here is my LSTM Autoencoder implementation:

import torch
import torch.nn as nn

device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

class Encoder(nn.Module):
  def __init__(self, seq_len, n_features, embedding_dim=32):
    super(Encoder, self).__init__()
    self.seq_len, self.n_features = seq_len, n_features
    self.embedding_dim, self.hidden_dim = embedding_dim, 2 * embedding_dim
    self.rnn1 = nn.LSTM(
      input_size=n_features,
      hidden_size=self.hidden_dim,
      num_layers=1,
      batch_first=True
    )
    self.rnn2 = nn.LSTM(
      input_size=self.hidden_dim,
      hidden_size=embedding_dim,
      num_layers=1,
      batch_first=True
    )
    
  def forward(self, x):
    x = x.reshape((1, self.seq_len, self.n_features))
    x, (_, _) = self.rnn1(x)
    x, (hidden_n, _) = self.rnn2(x)
    return hidden_n.reshape((self.n_features, self.embedding_dim))
    
    
class Decoder(nn.Module):
  def __init__(self, seq_len, n_features, input_dim=32):
    super(Decoder, self).__init__()
    self.seq_len, self.input_dim = seq_len, input_dim
    self.hidden_dim, self.n_features = 2 * input_dim, n_features
    self.rnn1 = nn.LSTM(
      input_size=input_dim,
      hidden_size=input_dim,
      num_layers=1,
      batch_first=True
    )
    self.rnn2 = nn.LSTM(
      input_size=input_dim,
      hidden_size=self.hidden_dim,
      num_layers=1,
      batch_first=True
    )
    self.output_layer = nn.Linear(self.hidden_dim, n_features)
    
  def forward(self, x):
    x = x.repeat(self.seq_len, self.n_features)
    x = x.reshape((self.n_features, self.seq_len, self.input_dim))
    x, (hidden_n, cell_n) = self.rnn1(x)
    x, (hidden_n, cell_n) = self.rnn2(x)
    x = x.reshape((self.seq_len, self.hidden_dim))
    return self.output_layer(x)


class LSTMAE(nn.Module):
  def __init__(self, seq_len, n_features, embedding_dim=32):
    super(LSTMAE, self).__init__()
    self.encoder = Encoder(seq_len, n_features, embedding_dim).to(device)
    self.decoder = Decoder(seq_len, n_features, embedding_dim).to(device)
    
  def forward(self, x):
    x = self.encoder(x)
    x = self.decoder(x)
    return x

The training procedure is done by the following code:

for epoch in range(epochs_num):
    model = model.train()
    train_losses = []
    for sequence in train:
      print(sequence)
      optimizer.zero_grad()
      sequence = sequence.to(device)
      seq_pred = model(sequence)
      loss = criterion(seq_pred, sequence)
      loss.backward()
      optimizer.step()
      train_losses.append(loss.item())
    val_losses = []
    model = model.eval()
    with torch.no_grad():
      for sequence in valid:
        sequence = sequence.to(device)
        seq_pred = model(sequence)
        loss = criterion(seq_pred, sequence)
        val_losses.append(loss.item())
    train_loss = np.mean(train_losses)
    val_loss = np.mean(val_losses)
    print("epoch : {}/{}, train_loss = {:.6f}, val_loss = {:.6f}".format(epoch + 1, epochs_num, train_loss, val_loss))

But when I start training my autoencoder I get the following error message at the end of the encoder:

    return hidden_n.reshape((self.n_features, self.embedding_dim))
RuntimeError: shape '[120, 32]' is invalid for input of size 32

Could you please point out where I went wrong?
Many thanks in advance.

To be honest, I didn’t really check your code…too tight on time :). However, I have some (I hope) working implementations of autoencoders that might be worth a look.

I noticed ample use of reshape() in your code, which might be perfectly correct. But it can be a cause of issues; see this post. Lastly, if your encoder and decoder are “symmetric”, that is, they have the same number of hidden dimensions and layers, you can directly copy the encoder’s last hidden state into the decoder, and there should be no need for repeat() and such things.
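To illustrate the kind of issue I mean with reshape(), here is a tiny sketch on a made-up 2x3 tensor (not your data). reshape() keeps the elements in their original memory order and only moves the row boundaries, so it is not a substitute for swapping axes with transpose() or permute():

```python
import torch

x = torch.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

# Same element order, just cut into rows of length 2.
reshaped = x.reshape(3, 2)

# Axes actually swapped: rows become columns.
transposed = x.transpose(0, 1)

print(reshaped.tolist())    # [[0, 1], [2, 3], [4, 5]]
print(transposed.tolist())  # [[0, 3], [1, 4], [2, 5]]
```

Both results have shape (3, 2), so no error is raised if the wrong one is used; the values are just silently mixed up.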

Hello, and thanks for this code. I was looking into LSTM autoencoders myself.

I think the problem is that the hidden_n variable is not supposed to have shape (self.n_features, self.embedding_dim) to begin with. Isn’t the number of features in the vectors of the input sequence supposed to be replaced by the number of embedding dimensions? The final hidden state of rnn2 holds only embedding_dim = 32 values, which is why a reshape to [120, 32] fails. I am pretty sure that the correct approach would be to reshape it to just embedding_dim, for example (1, self.embedding_dim), or something like that.
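Here is a minimal sketch of what I mean, assuming one sequence per forward pass as in the original code. nn.LSTM returns h_n with shape (num_layers * num_directions, batch, hidden_size), so here it is (1, 1, 32). The decoder then has to repeat that single embedding seq_len times instead of n_features times. The seq_len of 10 below is a made-up value just for the shape check, and I have not trained this:

```python
import torch
import torch.nn as nn


class Encoder(nn.Module):
    def __init__(self, seq_len, n_features, embedding_dim=32):
        super().__init__()
        self.seq_len, self.n_features = seq_len, n_features
        self.embedding_dim, self.hidden_dim = embedding_dim, 2 * embedding_dim
        self.rnn1 = nn.LSTM(n_features, self.hidden_dim, num_layers=1, batch_first=True)
        self.rnn2 = nn.LSTM(self.hidden_dim, embedding_dim, num_layers=1, batch_first=True)

    def forward(self, x):
        # Treat the input as a batch containing one sequence.
        x = x.reshape((1, self.seq_len, self.n_features))
        x, _ = self.rnn1(x)
        _, (hidden_n, _) = self.rnn2(x)
        # hidden_n is (1, 1, embedding_dim); drop the layer axis.
        return hidden_n.reshape((1, self.embedding_dim))


class Decoder(nn.Module):
    def __init__(self, seq_len, n_features, input_dim=32):
        super().__init__()
        self.seq_len, self.input_dim = seq_len, input_dim
        self.hidden_dim, self.n_features = 2 * input_dim, n_features
        self.rnn1 = nn.LSTM(input_dim, input_dim, num_layers=1, batch_first=True)
        self.rnn2 = nn.LSTM(input_dim, self.hidden_dim, num_layers=1, batch_first=True)
        self.output_layer = nn.Linear(self.hidden_dim, n_features)

    def forward(self, x):
        # Feed the same embedding at every time step:
        # (1, input_dim) -> (seq_len, input_dim) -> (1, seq_len, input_dim)
        x = x.repeat(self.seq_len, 1).reshape((1, self.seq_len, self.input_dim))
        x, _ = self.rnn1(x)
        x, _ = self.rnn2(x)
        x = x.reshape((self.seq_len, self.hidden_dim))
        return self.output_layer(x)  # (seq_len, n_features)


# Shape check on random data (seq_len=10 is an arbitrary choice).
seq_len, n_features = 10, 120
encoder, decoder = Encoder(seq_len, n_features), Decoder(seq_len, n_features)
sequence = torch.randn(seq_len, n_features)
embedding = encoder(sequence)
reconstruction = decoder(embedding)
print(embedding.shape, reconstruction.shape)
```

The reconstruction then has the same (seq_len, n_features) shape as the input sequence, so criterion(seq_pred, sequence) in the training loop compares tensors of matching shape.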

I hope this is helpful.