Teacher forcing for training and predicting with an LSTM

Hello everyone,

I am very new to PyTorch, so sorry if this is trivial, but I’m having some issues.

I have longitudinal data and I would like to train a recurrent neural network (let’s say an LSTM) for a classification task. Given the nature of the data, I’m allowed to use the true labels from the past in order to predict the present (which is usually not the case, as in machine translation, for instance). Thus, I would like to use teacher forcing in both the training phase and the prediction phase.

Inspired by this tutorial, I have the code below:

import torch
import torch.nn as nn
import torch.nn.functional as F  # needed for F.log_softmax in forward()


class LSTMTagger(nn.Module):

    def __init__(self, n_hidden_features, n_features, n_classes):
        super(LSTMTagger, self).__init__()

        self.lstm = nn.LSTM(n_features, n_hidden_features)
        self.hidden2tag = nn.Linear(n_hidden_features, n_classes)

    def forward(self, sample):
        lstm_out, _ = self.lstm(sample.view(len(sample), 1, -1))
        class_space = self.hidden2tag(lstm_out.view(len(sample), -1))
        class_scores = F.log_softmax(class_space, dim=1)
        return class_scores

As it is, I don’t think that I perform teacher forcing at any moment since the forward method does not take as input the true labels. Thus, my question is: how can one perform teacher forcing for both training and prediction phases? Should I try to slightly change the torch.nn.LSTM code? Any advice would be really appreciated.

When you train with teacher forcing, just shift the expected values by one position and feed them back as inputs.
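
To make the shifting idea concrete for the classification setup in the question, here is a minimal sketch of one training step. It rests on assumptions that are not in the thread: the previous true label is one-hot encoded and concatenated to the features of the current step, and the first step gets an all-zero vector because no earlier label exists. The names TeacherForcedTagger and shift_labels, and the sizes used, are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TeacherForcedTagger(nn.Module):
    """LSTM tagger whose input at step t also carries the true label of step t-1."""

    def __init__(self, n_features, n_hidden_features, n_classes):
        super().__init__()
        # Assumption: the previous label is appended as a one-hot vector.
        self.lstm = nn.LSTM(n_features + n_classes, n_hidden_features)
        self.hidden2tag = nn.Linear(n_hidden_features, n_classes)

    def forward(self, sample, prev_labels, hidden=None):
        # sample: (seq_len, n_features), prev_labels: (seq_len, n_classes)
        inputs = torch.cat([sample, prev_labels], dim=1).unsqueeze(1)
        lstm_out, hidden = self.lstm(inputs, hidden)
        class_space = self.hidden2tag(lstm_out.squeeze(1))
        return F.log_softmax(class_space, dim=1), hidden


def shift_labels(labels, n_classes):
    """Shift labels by one position: row t holds the one-hot label of step t-1."""
    one_hot = F.one_hot(labels, n_classes).float()
    start = torch.zeros(1, n_classes)  # no label is known before the first step
    return torch.cat([start, one_hot[:-1]], dim=0)


# One training step on a toy sequence (illustrative sizes and random data).
torch.manual_seed(0)
model = TeacherForcedTagger(n_features=4, n_hidden_features=8, n_classes=3)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

sample = torch.randn(5, 4)
labels = torch.tensor([0, 2, 1, 1, 0])
prev = shift_labels(labels, n_classes=3)

scores, _ = model(sample, prev)
loss = F.nll_loss(scores, labels)  # targets are the unshifted labels
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The model never sees the label of the step it is predicting, only the label of the step before it, so the whole sequence can still be processed in a single call during training.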

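When you predict, you should store the hidden state of the LSTM and feed it back position by position. nn.LSTM supports this: it accepts an initial (h_0, c_0) state as a second argument and returns the updated state alongside its output.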
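
A step-by-step prediction loop along these lines might look like the sketch below. It assumes, as one possible design, that the input at each step is the feature vector concatenated with a one-hot encoding of the previous true label, and that this label becomes available before the next step is predicted. The function name predict_stream and the layer sizes are hypothetical, and the layers here are untrained.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

n_features, n_hidden_features, n_classes = 4, 8, 3  # illustrative sizes

# Untrained layers, only to show the data flow.
lstm = nn.LSTM(n_features + n_classes, n_hidden_features)
hidden2tag = nn.Linear(n_hidden_features, n_classes)


def predict_stream(features, true_labels):
    """Predict one position at a time, carrying the LSTM state forward."""
    hidden = None  # nn.LSTM starts from zero states when no state is given
    prev_label = torch.zeros(n_classes)  # no label is known before the first step
    predictions = []
    with torch.no_grad():
        for t in range(len(features)):
            # Input shape (seq_len=1, batch=1, n_features + n_classes).
            step_input = torch.cat([features[t], prev_label]).view(1, 1, -1)
            out, hidden = lstm(step_input, hidden)  # feed the stored state back
            scores = F.log_softmax(hidden2tag(out.view(1, -1)), dim=1)
            predictions.append(scores.argmax(dim=1).item())
            # Assumption: the true label of step t is revealed after predicting it.
            prev_label = F.one_hot(true_labels[t], n_classes).float()
    return predictions


torch.manual_seed(0)
preds = predict_stream(torch.randn(5, 4), torch.tensor([0, 2, 1, 1, 0]))
```

If the true labels were not available at prediction time, the same loop could feed back the predicted class instead, which is the usual inference setup in sequence generation.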

This sample might be helpful: https://github.com/pytorch/examples/tree/master/word_language_model


Thank you, it is clearer now.

@Konstantin_Solomatov where is teacher forcing used in that code? I could not find it.

This PyTorch tutorial demonstrates it.