Model.train() vs model.eval() on pretrained model

While using a pretrained model provided by its author, I evaluated it with my own code and reproduced the accuracy they report. But when I run the same evaluation with the model set to model.train(), on the same evaluation dataset, the accuracy drops slightly (not by much). The only differences I made were calling model.train() instead of model.eval() and commenting out with torch.no_grad().
Note: I did not call optimizer.step() in the train method. I ran the same code as in the evaluation script and changed only those two lines.

Even if you don’t call optimizer.step(), if the pretrained model has batch norm layers, their running statistics will still be updated by every forward pass in train mode. If the pre-training was done on a large dataset, it is best not to change them. Also, if the model has dropout layers, the predictions will be somewhat random in train mode. Hence it is better to evaluate only after calling model.eval().
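A minimal sketch of this effect, using a standalone BatchNorm1d layer as a stand-in for the pretrained model: the running statistics change on a plain forward pass in train mode, with no backward pass or optimizer.step() anywhere, while eval mode leaves them untouched.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(4)

before = bn.running_mean.clone()

bn.train()
_ = bn(torch.randn(8, 4))   # forward pass only: no backward, no optimizer.step()
after_train = bn.running_mean.clone()

bn.eval()
_ = bn(torch.randn(8, 4))   # in eval mode the running stats are frozen
after_eval = bn.running_mean.clone()

print(torch.equal(before, after_train))      # False: train-mode forward updated the stats
print(torch.equal(after_train, after_eval))  # True: eval-mode forward left them alone
```

This is why evaluating in train mode both perturbs the pretrained statistics and normalizes each batch by its own mean/variance instead of the stored ones, which accounts for the small accuracy drop.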

That sounds reasonable. Yes, the pretrained model has batch norm layers. This raises a new question: if I want to train on a new dataset starting from these pretrained weights, do I need to freeze the batch norm layers after loading the pretrained weights into my model?

It depends on the new dataset. If it has quite a lot of data, unfreezing batch norm is helpful; otherwise it is not. In practice, I haven’t noticed too large a difference either way.
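If you do decide to freeze them, a sketch of one common approach (the small nn.Sequential here is a hypothetical stand-in for your pretrained network). Two things need freezing: the running statistics (put the BN layers in eval mode) and the affine weight/bias (disable their gradients). Note that a later model.train() flips BN back to train mode, so re-apply the freeze after every model.train() call.

```python
import torch.nn as nn

# Hypothetical stand-in for a pretrained network with batch norm.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)

def freeze_batchnorm(model: nn.Module) -> None:
    """Freeze BN running stats and affine parameters for fine-tuning."""
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.eval()  # stop updating running_mean / running_var
            if m.affine:
                m.weight.requires_grad_(False)  # freeze learnable scale
                m.bias.requires_grad_(False)    # freeze learnable shift

model.train()            # training mode for the rest of the network...
freeze_batchnorm(model)  # ...but keep BN frozen (re-call after each model.train())

bn = model[1]
print(bn.training)              # False: BN stays in eval mode
print(bn.weight.requires_grad)  # False: affine params excluded from training
```

The conv layer's weights still have requires_grad=True, so the rest of the network trains normally while the pretrained batch norm statistics are preserved.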

What’s the difference between model.train(False) and model.eval()?

There is no difference: .eval() internally calls .train(False), as shown in this line of code.
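A quick check of the equivalence: both calls just set the module's (and its submodules') training flag, so two identical modules end up in the same state either way.

```python
import torch.nn as nn

m1 = nn.Dropout(p=0.5)
m2 = nn.Dropout(p=0.5)

m1.eval()         # equivalent to m1.train(False)
m2.train(False)

print(m1.training, m2.training)  # False False
```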