Hi,
In the validation process we set model.eval() to tell batchnorm and dropout layers that we are in eval mode.
But what exactly is the difference if we stay in train mode instead?
In the docs it said:
During training, this layer keeps a running estimate of its computed mean and variance. The running sum is kept with a default momentum of 0.1.
During evaluation, this running mean/variance is used for normalization.
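The update rule the docs describe can be sketched numerically (PyTorch's convention is assumed: the new batch statistic is blended in with weight `momentum`):

```python
# Sketch of the running-stat update with the default momentum of 0.1 (assumed values).
momentum = 0.1
running_mean = 0.0   # initial running estimate
batch_mean = 2.0     # mean of the current batch (made-up number)

# running estimate moves a little toward the batch statistic each step
running_mean = (1 - momentum) * running_mean + momentum * batch_mean
print(running_mean)  # 0.2
```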
What if I set requires_grad=False for the whole model? Does that mean the running mean/var will not be modified?
In that case, is training mode the same as validation mode?
Thanks in advance.
upd: This question came up when using vgg16 as a perceptual loss. Do we need to set the vgg16 model to eval mode?
requires_grad does not change the train/eval mode; it only avoids calculating gradients for the affine parameters (weight and bias). bn.train() and bn.eval() change how the running stats (running_mean and running_var) are used: in eval mode they are used for normalization, in train mode the current batch statistics are used instead.
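A minimal sketch of that difference, assuming a fresh BatchNorm1d layer and a batch whose statistics differ from the initial running stats:

```python
# Sketch: train() vs eval() change which statistics BatchNorm normalizes with.
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(3)
x = torch.randn(8, 3) * 5 + 2  # batch with nonzero mean and non-unit variance

bn.train()
out_train = bn(x)  # normalized with the batch's own mean/var (running stats also updated)

bn.eval()
out_eval = bn(x)   # normalized with the stored running_mean/running_var

# Different statistics are used, so the outputs differ.
print(torch.allclose(out_train, out_eval))  # False
```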
Aren't the running mean/var parameters?
If requires_grad is set to False, the parameters shouldn't be changed. Does that mean that with requires_grad=False the running stats will not be updated, so .train() and .eval() are the same?
No, running_mean and running_var are buffers, not parameters; they do not require gradients, but they will still be updated (in train mode) using the current batch statistics and the momentum.
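This is easy to verify: freeze the parameters, run a forward pass in each mode, and watch the running_mean buffer (a small sketch with made-up input data):

```python
# Sketch: running stats are buffers and update in train() even with requires_grad=False.
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(3)
for p in bn.parameters():        # only weight and bias are parameters
    p.requires_grad = False

before = bn.running_mean.clone()

bn.train()
_ = bn(torch.randn(8, 3) + 5.0)  # forward pass in train mode
print(torch.equal(before, bn.running_mean))  # False: buffer was updated anyway

bn.eval()
mid = bn.running_mean.clone()
_ = bn(torch.randn(8, 3) + 5.0)  # forward pass in eval mode
print(torch.equal(mid, bn.running_mean))     # True: buffer left untouched
```

So freezing gradients and switching to eval mode are independent: to keep a pretrained network like vgg16 fixed as a perceptual loss, you typically want both requires_grad=False and model.eval().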