Trying to understand the meaning of model.train() and model.eval()

isalirezag · June 23, 2018, 10:20pm

Hi

In main.py (https://github.com/pytorch/examples/blob/master/imagenet/main.py) I see model.train() and model.eval(), but I don't understand how to use them. Can someone explain them to me, please?
For example, here:
python main.py -a resnet18 [imagenet-folder with train and val folders] we did not specify train or eval, so how do we know which one is used?
I know my question is stupid; please let me know if there is a good tutorial I can read to understand it.

Thanks

anis016 · June 23, 2018, 11:51pm

Maybe this will clear things up.

By default, all modules are initialized in train mode (self.training = True). Also be aware that some layers, such as BatchNorm and Dropout, behave differently during training and evaluation, so setting the mode matters.
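
To make the mode flag concrete, here is a minimal sketch. The tiny nn.Sequential model is made up for illustration and is not the resnet18 from main.py. It only shows that a freshly built module starts in train mode and that model.eval() and model.train() flip the flag on every submodule; nothing is chosen on the command line, the script itself calls these methods before each phase.

```python
import torch.nn as nn

# Hypothetical toy model, used only to inspect the mode flag.
model = nn.Sequential(
    nn.Linear(4, 4),
    nn.BatchNorm1d(4),
    nn.Dropout(p=0.5),
)

# A newly constructed module starts in train mode.
default_mode = model.training
print(default_mode)

# eval() sets training = False on the model and all of its submodules.
model.eval()
eval_flags = [m.training for m in model.modules()]
print(eval_flags)

# train() switches everything back, e.g. before the next training epoch.
model.train()
train_flags = [m.training for m in model.modules()]
print(train_flags)
```

A typical script calls model.train() at the start of its training loop and model.eval() at the start of its validation loop, so both modes get used within a single run.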

Dropout acts as a regularizer that helps prevent overfitting during training.
On each forward call, it randomly zeros elements of the Dropout layer's input.
It should be disabled during testing (model.eval()), since you want to use the full model then (no element is masked).
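
The difference can be seen on a standalone Dropout layer. This is a small sketch, assuming p=0.5 and an all-ones input picked purely for illustration. In train mode some elements are zeroed and the survivors are scaled by 1/(1-p); in eval mode the layer passes its input through unchanged.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # only so that repeated runs give the same mask

drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

# Train mode: elements are zeroed at random, the rest are scaled by 1 / (1 - p).
drop.train()
y_train = drop(x)
print(y_train[:10])

# Eval mode: dropout is disabled and the input is returned as-is.
drop.eval()
y_eval = drop(x)
print(torch.equal(y_eval, x))
```

Note that model.eval() only changes layer behavior; it does not turn off gradient tracking. Wrapping evaluation in torch.no_grad() is a separate step that is commonly used alongside it.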