I have observed that batch normalization statistics such as the running mean and running variance get updated after we do the forward pass through the model (i.e., right after `output = model(input)`).
So when we train the model and evaluate it after every epoch, is it recommended to put the model in eval mode before doing the inference, or not?
Yes, when batchnorm is in training mode, the running stats are always updated.
And indeed, you should use eval mode when evaluating, so that these stats are used for normalization and are no longer updated.
So if I want to update some other parameters using my evaluation loss (on the validation set), I should first compute the validation loss with the model in eval mode, so that the running mean and variance are not updated, and then switch back to train mode to update those additional parameters, right?
Because if we updated these statistics during evaluation, we would effectively be leaking information from the validation set into training, which I believe is not recommended.
@albanD a quick question about the running mean and variance while the model is in train mode. If my model is in train mode, does PyTorch use the statistics of only the current mini-batch, or does it compute the running mean and variance from previous batches as well as the current one?
But the short version is that during training, it computes the stats on the current batch, uses them to compute the output, and updates the running stats.
During evaluation, it uses the saved running stats to compute the output.