Weird behaviors when calling model.train() and model.eval() alternately

Hi,

I noticed that my model yields poor results when trying to do this:

for epoch in range(num_epochs):
    model.train()
    # training code
    model.eval()
    # evaluation code

However, this script works perfectly fine:

for epoch in range(num_epochs):
    model.train()
    # training code

model.eval()
# evaluation code

So in the 2nd script, I wait for the model to finish training before evaluating, while in the 1st one, I evaluate after every epoch.

Why does the 1st method give worse results?

Thanks

I don’t think there should be any difference between running evaluation inside the training loop and running it afterward, except for the obvious fact that the model has trained longer if you only evaluate at the end. Could you expand a little on what weird behaviors you encounter? Could you give some example code (preferably not too large) that reproduces this behavior? That would help with debugging 🙂
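
For reference, here is a minimal sketch of a per-epoch train/eval loop that should behave the same as evaluating once at the end (aside from training duration). Note that model, train_loader, val_loader, criterion, optimizer, and num_epochs are placeholder names, and I'm assuming a standard supervised setup:

import torch

for epoch in range(num_epochs):
    model.train()  # enable dropout; BatchNorm uses batch statistics
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()

    model.eval()  # disable dropout; BatchNorm uses running statistics
    with torch.no_grad():  # don't track gradients during evaluation
        val_loss = sum(
            criterion(model(inputs), targets).item()
            for inputs, targets in val_loader
        ) / len(val_loader)
    print(f"epoch {epoch}: val loss {val_loss:.4f}")

If your first script deviates from this pattern somewhere, posting the actual code would make it much easier to spot what's going on.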