How to train with frozen BatchNorm?

Hi! I think you can fix the affine transformation by setting the `affine` parameter to False when you construct the BatchNorm layer.
What that does is remove the learnable scale ("gamma") and shift ("beta") parameters, effectively fixing them at 1 and 0. The running mean and variance are a separate thing: they're exponentially weighted averages that BatchNorm tracks during training, and they're only used in place of the batch statistics at test time (that's when you call eval()). So if you want the statistics frozen during training as well, you also need to put the BatchNorm modules themselves in eval mode.
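Here's a minimal sketch of what I mean, assuming standard PyTorch BatchNorm layers (the `freeze_batchnorm` helper name is just my own):

```python
import torch.nn as nn

def freeze_batchnorm(model: nn.Module) -> None:
    """Put every BatchNorm layer in eval mode and stop its gradients."""
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.eval()                      # use running stats; don't update them
            for p in m.parameters():      # no-op if affine=False
                p.requires_grad_(False)   # freeze gamma ("weight") and beta ("bias")

model = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1),
    nn.BatchNorm2d(64),  # or BatchNorm2d(64, affine=False) to drop gamma/beta entirely
    nn.ReLU(),
)

model.train()            # the rest of the network trains normally
freeze_batchnorm(model)  # re-apply after every model.train() call, since
                         # train() flips the BatchNorm modules back to training mode
```

One gotcha: `model.train()` resets all submodules to training mode, so the freeze has to be re-applied after each call to it.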

I have a similar issue with the train/eval discrepancy you're describing, but no benchmark to compare it against (RL problems): the two modes give me results that differ by orders of magnitude on the first iteration. I'm hoping that's because, at that point, the running statistics have only been updated from the first batch. If someone knows the answer, please enlighten me…