Hi everyone,
I have implemented a simple Many-to-One LSTM Encoder-Classifier.
The model takes a packed sequence as input (as my input data has variable length) and outputs the probabilities for the target classes. The input sequences are rather long (about 3000 time steps each).
I am running the training on a 16" MacBook Pro (6-core Intel Core i7, AMD Radeon Pro 5300M 4 GB), but unfortunately it seems that the training is extremely slow (up to 45 minutes per epoch).
As I have never worked with recurrent neural networks (or LSTMs) before, it is hard for me to tell whether this is a normal duration for backpropagation through time or whether there is something wrong with my implementation.
Can anybody please help me?
Here is my model:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils.rnn import pad_packed_sequence

class MyModel(nn.Module):
    def __init__(self, input_features, output_features, n_classes):
        super(MyModel, self).__init__()
        self.encoder = nn.LSTM(input_features, output_features, 1,
                               batch_first=True, dropout=0, bidirectional=False)
        self.cla = nn.Linear(output_features, n_classes)

    def forward(self, x: torch.nn.utils.rnn.PackedSequence) -> torch.Tensor:
        z, _ = self.encoder(x)
        # Unpack to (batch, max_len, hidden) plus the original lengths
        z_unpacked, lens_unpacked = pad_packed_sequence(z, batch_first=True)
        # Pick the last valid output of each sequence
        last_elements = z_unpacked[torch.arange(z_unpacked.shape[0]), lens_unpacked - 1]
        y = self.cla(last_elements)
        y = F.softmax(y, dim=1)
        return y
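For context, here is a minimal sketch of how the packed input is built and fed through an LSTM of this shape. The batch size, feature count, and sequence lengths below are made up for illustration, not my real data:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Illustrative sizes: 4 sequences, 8 input features, hidden size 16
lengths = torch.tensor([10, 7, 5, 3])
batch = torch.zeros(4, 10, 8)  # zero-padded to the longest sequence

lstm = nn.LSTM(8, 16, 1, batch_first=True)

# Pack so the LSTM skips the padded time steps
packed = pack_padded_sequence(batch, lengths, batch_first=True,
                              enforce_sorted=False)
out, _ = lstm(packed)

# Unpack and select the last valid output of each sequence
unpacked, lens = pad_packed_sequence(out, batch_first=True)
last = unpacked[torch.arange(unpacked.shape[0]), lens - 1]
print(last.shape)  # one hidden vector per sequence
```

(`enforce_sorted=False` lets the sequences arrive in any order; PyTorch sorts them internally.)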