I was wondering whether the statement
model.train() affects the PyTorch LSTM module in any way.
By default, following the theory of a seq2seq model, when the model is set to training mode it should apply teacher forcing, while it should not apply teacher forcing when the model is set to evaluation mode. If so, why is this not clearly specified in the docs? Is there anything obvious which I am not seeing?
Thanks a lot
Teacher forcing is not intrinsic to the model but to how you use the model, so
train() has nothing to do with it. Of course, if you make inferences, you should always set the model to
eval().
If you check the PyTorch Seq2Seq tutorial, teacher forcing is only used in the
train() method, but not in the
evaluate() method for making inferences. There you simply have no ground truth, so the prediction of the next word depends on the last predicted word.
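To make the point concrete, here is a minimal, framework-agnostic sketch (not the tutorial's exact code): the difference between teacher forcing and free-running decoding lies entirely in the control flow of the decoding loop, not in train()/eval(). The step function below is a hypothetical stand-in for a one-step decoder; a real model would run an LSTM cell plus an output projection there.

```python
def step(prev_token):
    # Toy stand-in for one decoder step: deterministically maps the
    # previous token id to a predicted next token id. A real seq2seq
    # decoder would run an LSTM cell and an output layer here.
    return (prev_token * 2 + 1) % 10

def decode(start_token, target_tokens, teacher_forcing=True):
    outputs = []
    prev = start_token
    for t in range(len(target_tokens)):
        pred = step(prev)
        outputs.append(pred)
        if teacher_forcing:
            # Training-time option: feed the ground-truth token
            # as the next input, regardless of what was predicted.
            prev = target_tokens[t]
        else:
            # Inference: no ground truth exists, so feed the model's
            # own last prediction back in as the next input.
            prev = pred
    return outputs

target = [3, 5, 7, 9]
with_tf = decode(0, target, teacher_forcing=True)
without_tf = decode(0, target, teacher_forcing=False)
```

Note that nothing in decode() inspects a training/evaluation flag; you decide which branch to take when you write your training and inference loops, which is why model.train() and model.eval() have no effect on it.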