Affine and momentum of BatchNorm layer

In a specific application, I need to freeze the running-statistics update of a BatchNorm layer in part of my code, but I still need the “gamma (weight)” and “beta (bias)” of this layer to take part in training, with gradients in the forward/backward pass.
I have implemented this by adding an extra BatchNorm layer with affine=False and doing the forward pass as:

base_BatchNorm = nn.BatchNorm1d(1200)
extra_BatchNorm = nn.BatchNorm1d(1200, affine=False)
x = extra_BatchNorm(x) * base_BatchNorm.weight + base_BatchNorm.bias
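As a quick sanity check that this workaround behaves as described (a self-contained sketch; the track_running_stats=False flag is my addition, which also keeps the helper layer from updating its own, unused, buffers):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

base_BatchNorm = nn.BatchNorm1d(1200)
# affine=False: no gamma/beta here; track_running_stats=False additionally
# stops this helper layer from updating its own (unused) running buffers.
extra_BatchNorm = nn.BatchNorm1d(1200, affine=False, track_running_stats=False)

x = torch.randn(16, 1200)
out = extra_BatchNorm(x) * base_BatchNorm.weight + base_BatchNorm.bias

# In training mode a plain BatchNorm also normalizes with batch statistics,
# so with freshly initialized gamma=1, beta=0 both paths agree:
ref = base_BatchNorm(x)
print(torch.allclose(out, ref, atol=1e-5))  # True

# gamma and beta receive gradients through the manual affine step:
out.sum().backward()
print(base_BatchNorm.weight.grad is not None)  # True
```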

I am curious whether there is a cleaner way to implement this. I think it could be done by setting momentum=0, or by a combination of momentum=0 and affine=False on base_BatchNorm, as in:

base_BatchNorm.momentum = 0
base_BatchNorm.affine = False
x = base_BatchNorm(x)
base_BatchNorm.momentum = 0.1
base_BatchNorm.affine = True

Is it right?
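The momentum = 0 half of this can be checked directly, since the buffer update rule is running = (1 - momentum) * running + momentum * batch_stat; note, though, that flipping the .affine attribute after construction has no effect, because the weight and bias parameters were already created in __init__ and the forward pass uses them regardless. A minimal check:

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(1200)
before = bn.running_mean.clone()

# Update rule: running = (1 - momentum) * running + momentum * batch_stat,
# so momentum = 0 leaves the buffers exactly as they were.
bn.momentum = 0.0
x = torch.randn(16, 1200) + 5.0  # clearly non-zero mean
bn(x)

print(torch.equal(bn.running_mean, before))  # True: buffers untouched
```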

You can call bn.eval(), which will use the running stats and will not update them. The affine parameters will still be trained.

I don’t want to use the running stats in this part of my code. I want to apply the zero-mean, unit-variance operation with batch statistics, updating the affine parameters but not the running stats. I think calling bn.eval() will use the running stats in the forward pass of the BatchNorm layer. Is that right?

In that case you could apply bn.weight and bn.bias manually (an elementwise scale and shift) during the forward pass.
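A sketch of that manual approach (bn.weight and bn.bias are per-channel 1-D vectors, so the affine step is an elementwise multiply-add; F.batch_norm with training=True and no running buffers bundles the whole operation into one call):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

bn = nn.BatchNorm1d(1200)
x = torch.randn(16, 1200)

# Normalize with *batch* statistics, apply the trainable gamma/beta,
# and never touch bn.running_mean / bn.running_var:
out = F.batch_norm(x, None, None, weight=bn.weight, bias=bn.bias,
                   training=True, eps=bn.eps)

# Equivalent manual version: standardize, then elementwise affine.
mean = x.mean(dim=0)
var = x.var(dim=0, unbiased=False)  # biased variance, as BatchNorm uses
manual = (x - mean) / torch.sqrt(var + bn.eps) * bn.weight + bn.bias
print(torch.allclose(out, manual, atol=1e-5))  # True
```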

You mean something like:

My issue is that I want to apply this operation to other deep nets (e.g. ResNet) whose forward methods I cannot change directly, as they use BatchNorm in multiple places. Besides, I would have to do this for every net, which is not feasible. Can I apply this call:

for the batchnorm layers without changing their code internally? Something like ‘model.apply’?

No, model.apply will recursively apply the passed method to all layers.

I think the cleanest way would be to write a custom module implementing the behavior you described and to replace all batchnorm layers in the model with your custom one.
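A sketch of that replacement step (the FrozenStatsBatchNorm wrapper and freeze_bn_stats helper names are illustrative, not from the thread):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FrozenStatsBatchNorm(nn.Module):
    """Batch-stat normalization with trainable gamma/beta, frozen buffers.

    Wraps an existing BatchNorm layer and reuses its weight/bias parameters,
    so the affine terms keep training while running_mean/running_var
    are never updated.
    """
    def __init__(self, bn):
        super().__init__()
        self.bn = bn  # keep the original layer (and its parameters)

    def forward(self, x):
        return F.batch_norm(x, None, None, weight=self.bn.weight,
                            bias=self.bn.bias, training=True, eps=self.bn.eps)

def freeze_bn_stats(model):
    # Recursively swap every BatchNorm child for the wrapper via setattr,
    # without touching the model's own forward code.
    for name, child in model.named_children():
        if isinstance(child, nn.modules.batchnorm._BatchNorm):
            setattr(model, name, FrozenStatsBatchNorm(child))
        else:
            freeze_bn_stats(child)

model = nn.Sequential(nn.Linear(10, 1200), nn.BatchNorm1d(1200), nn.ReLU())
freeze_bn_stats(model)
print(type(model[1]).__name__)  # FrozenStatsBatchNorm
```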


Sounds good. Thank you.