Momentum in BN and eval mode

When I set the module into eval mode, what does the BN layer do?
Q1: The docs say it uses gamma and beta to transform the input, but where do they come from?
Q2: What does the momentum parameter do in BN?

Gamma and beta are trainable parameters. Have a look at the original paper. They allow the layer to "learn" another normalization, or even to recover the original activations.
You can disable these parameters by setting affine=False in the BatchNorm layer.
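To make this concrete, here is a small sketch showing where gamma and beta live: PyTorch exposes them as the layer's `weight` and `bias`, and `affine=False` removes them entirely.

```python
import torch
import torch.nn as nn

# With affine=True (the default), gamma and beta are learnable parameters,
# exposed as .weight and .bias and initialized to ones and zeros.
bn = nn.BatchNorm1d(4)
print(bn.weight)  # gamma, initialized to ones
print(bn.bias)    # beta, initialized to zeros

# With affine=False the layer has no learnable parameters at all.
bn_plain = nn.BatchNorm1d(4, affine=False)
print(bn_plain.weight)                 # None
print(list(bn_plain.parameters()))     # empty list
```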

During evaluation, the running_mean and running_var, together with gamma and beta, are used to normalize the data. The running stats were calculated during training from the batch statistics. The momentum defines how much of the currently observed statistics (mean and var) is added to the history of these values.
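A quick sketch of what eval mode actually computes: the layer's output can be reproduced by hand from `running_mean`, `running_var`, gamma (`weight`), and beta (`bias`).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(3)

# A few forward passes in training mode populate the running statistics.
for _ in range(5):
    bn(torch.randn(8, 3))

bn.eval()
x = torch.randn(8, 3)
out = bn(x)

# In eval mode the layer normalizes with the stored running stats
# and then applies the affine transform with gamma and beta.
manual = (x - bn.running_mean) / torch.sqrt(bn.running_var + bn.eps)
manual = manual * bn.weight + bn.bias
print(torch.allclose(out, manual, atol=1e-5))  # True
```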
The doc says:

x̂_new = (1 − momentum) × x̂ + momentum × x_t, where x̂ is the estimated statistic and x_t is the new observed value.
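This update rule can be checked numerically. Starting from the initial values (running_mean = 0, running_var = 1), one training-mode forward pass should move the stats by a factor of `momentum` toward the batch statistics. One detail worth knowing: PyTorch uses the *unbiased* batch variance for the running_var update.

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(3, momentum=0.1)  # running_mean starts at 0, running_var at 1
x = torch.randn(16, 3)
bn.train()
bn(x)

# x̂_new = (1 - momentum) * x̂ + momentum * x_t
expected_mean = 0.9 * torch.zeros(3) + 0.1 * x.mean(dim=0)
# The unbiased batch variance is used for the running_var update.
expected_var = 0.9 * torch.ones(3) + 0.1 * x.var(dim=0, unbiased=True)

print(torch.allclose(bn.running_mean, expected_mean, atol=1e-5))  # True
print(torch.allclose(bn.running_var, expected_var, atol=1e-5))    # True
```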

Let me know, if something is still unclear.

It might take me some time to review the paper :grin:
Thanks a lot!
So the momentum is used to optimize the two learnable parameters beta and gamma, am I right? :thinking:

No the momentum is just used for the weighted average of the running statistics.
Beta and Gamma are optimized like “normal” weights and biases, i.e. using the learning rate and gradients.
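The distinction is visible in the module itself: gamma and beta are registered as parameters (so the optimizer updates them via gradients), while the running stats are registered as buffers that the optimizer never touches. A small sketch:

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(3)

# gamma and beta show up as trainable parameters ...
param_names = [name for name, _ in bn.named_parameters()]
print(param_names)   # ['weight', 'bias']

# ... while the running stats are buffers, invisible to the optimizer.
buffer_names = [name for name, _ in bn.named_buffers()]
print(buffer_names)  # includes 'running_mean' and 'running_var'

# A backward pass produces gradients for gamma and beta only.
out = bn(torch.randn(8, 3))
out.sum().backward()
print(bn.weight.grad is not None)      # True
print(bn.running_mean.grad is None)    # True: buffers get no gradients
```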

So the momentum is used when we update the mean and variance that will be used in eval mode, i.e. in each training iteration we use it to compute the new values from the old mean and var of the last iteration and the current mini-batch's internal statistics?

Yes, exactly. You have to be a bit careful with the meaning of momentum, since it weights the currently observed value. Its default value is thus 0.1. You will find the opposite definition in other frameworks, with a default value of 0.9.
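A minimal example of this convention: with PyTorch's default `momentum=0.1`, a single batch with mean 10 moves the running mean from 0 to `0.9 * 0 + 0.1 * 10 = 1.0` (in frameworks with the opposite convention, a "momentum" of 0.9 would produce the same result).

```python
import torch
import torch.nn as nn

# PyTorch's momentum weights the *current* batch statistic.
bn = nn.BatchNorm1d(1, momentum=0.1)
bn.train()

x = torch.full((4, 1), 10.0)  # batch mean is exactly 10
bn(x)

# running_mean = (1 - 0.1) * 0 + 0.1 * 10 = 1.0
print(bn.running_mean)  # tensor([1.])
```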

Thanks a lot!!!
Wish you a good day