Hi,
In the validation process we set model.eval() to tell batchnorm and dropout layers that we are in eval mode.
But what exactly is the difference if we stay in train mode instead?
In the docs it said:
During training, this layer keeps a running estimate of its computed mean and variance. The running sum is kept with a default momentum of 0.1.
During evaluation, this running mean/variance is used for normalization.
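The update rule the docs describe can be sketched numerically (PyTorch's convention is assumed: the new batch statistic is blended in with weight `momentum`):

```python
# Sketch of the running-stat update with the default momentum of 0.1 (assumed values).
momentum = 0.1
running_mean = 0.0   # initial running estimate
batch_mean = 2.0     # mean of the current batch (made-up number)

# running estimate moves a little toward the batch statistic each step
running_mean = (1 - momentum) * running_mean + momentum * batch_mean
print(running_mean)  # 0.2
```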
What if I set requires_grad=False for the whole model? Does that mean the running mean/var will not be modified?
In that case, is training mode the same as validation mode?
Thanks in advance.
upd: This question came up when using vgg16 as a perceptual loss. Do we need to set the vgg16 model to eval mode?
requires_grad does not change the train/eval mode; it only avoids calculating gradients for the affine parameters (weight and bias). bn.train() and bn.eval() change how the running stats (running_mean and running_var) are used: in eval mode they are used for normalization, in train mode the current batch statistics are used instead.
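A minimal sketch of that difference, assuming a fresh BatchNorm1d layer and a batch whose statistics differ from the initial running stats:

```python
# Sketch: train() vs eval() change which statistics BatchNorm normalizes with.
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(3)
x = torch.randn(8, 3) * 5 + 2  # batch with nonzero mean and non-unit variance

bn.train()
out_train = bn(x)  # normalized with the batch's own mean/var (running stats also updated)

bn.eval()
out_eval = bn(x)   # normalized with the stored running_mean/running_var

# Different statistics are used, so the outputs differ.
print(torch.allclose(out_train, out_eval))  # False
```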
Aren't the running mean/var parameters?
If requires_grad is set to False, the parameters shouldn't be changed. Does that mean that with requires_grad=False the running stats will not be updated, so .train() and .eval() are the same?
No, running_mean and running_var are buffers, not parameters; they do not require gradients, but they will still be updated (in train mode) using the current batch statistics and the momentum.
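This is easy to verify: freeze the parameters, run a forward pass in each mode, and watch the running_mean buffer (a small sketch with made-up input data):

```python
# Sketch: running stats are buffers and update in train() even with requires_grad=False.
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(3)
for p in bn.parameters():        # only weight and bias are parameters
    p.requires_grad = False

before = bn.running_mean.clone()

bn.train()
_ = bn(torch.randn(8, 3) + 5.0)  # forward pass in train mode
print(torch.equal(before, bn.running_mean))  # False: buffer was updated anyway

bn.eval()
mid = bn.running_mean.clone()
_ = bn(torch.randn(8, 3) + 5.0)  # forward pass in eval mode
print(torch.equal(mid, bn.running_mean))     # True: buffer left untouched
```

So freezing gradients and switching to eval mode are independent: to keep a pretrained network like vgg16 fixed as a perceptual loss, you typically want both requires_grad=False and model.eval().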