I was wondering whether the statement
model.train() affects the PyTorch LSTM module in any way.
By default, following the theory of a seq2seq model, when the model is set to training mode it should apply teacher forcing, while it should not apply teacher forcing when the model is set to evaluation mode. If so, why is this not clearly specified in the docs? Is there anything obvious which I am not seeing?
Thanks a lot
Teacher forcing is not intrinsic to the model but to how you use the model, so
train() has nothing to do with it. Of course, if you make inferences, you should always set the model to
eval().
If you check the PyTorch Seq2Seq tutorial, teacher forcing is only used in the
train() method, but not in the
evaluate() method for making inferences. There you simply have no ground truth, so the prediction of the next word depends on the last predicted word.
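To make the point concrete, here is a minimal, framework-agnostic sketch (not the tutorial's exact code): the difference between teacher forcing and free-running decoding lies entirely in the control flow of the decoding loop, not in train()/eval(). The step function below is a hypothetical stand-in for a one-step decoder; a real model would run an LSTM cell plus an output projection there.

```python
def step(prev_token):
    # Toy stand-in for one decoder step: deterministically maps the
    # previous token id to a predicted next token id. A real seq2seq
    # decoder would run an LSTM cell and an output layer here.
    return (prev_token * 2 + 1) % 10

def decode(start_token, target_tokens, teacher_forcing=True):
    outputs = []
    prev = start_token
    for t in range(len(target_tokens)):
        pred = step(prev)
        outputs.append(pred)
        if teacher_forcing:
            # Training-time option: feed the ground-truth token
            # as the next input, regardless of what was predicted.
            prev = target_tokens[t]
        else:
            # Inference: no ground truth exists, so feed the model's
            # own last prediction back in as the next input.
            prev = pred
    return outputs

target = [3, 5, 7, 9]
with_tf = decode(0, target, teacher_forcing=True)
without_tf = decode(0, target, teacher_forcing=False)
```

Note that nothing in decode() inspects a training/evaluation flag; you decide which branch to take when you write your training and inference loops, which is why model.train() and model.eval() have no effect on it.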