Batch Normalization objects

shrutishrestha · April 14, 2021, 6:43am

Suppose we initialize a batchnorm layer:
bn1 = nn.BatchNorm2d(64, track_running_stats=False)

Can we use this bn1 after every convolution layer that outputs 64 channels?

Or do we need to create one per layer:
bn1 = nn.BatchNorm2d(64, track_running_stats=False)
bn2 = nn.BatchNorm2d(64, track_running_stats=False)
to use after 2 conv layers that each output 64 channels?

Does batch norm keep state, i.e. the running mean and variance, from previous inputs?

ptrblck · April 14, 2021, 7:04am

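You could reuse the same batchnorm layer, but this would reuse the same affine parameters (weight and bias) and, when running stats are tracked, update the same running stats with each input, so it’s most likely not what you want. With track_running_stats=False no running mean or variance is stored and the batch statistics are always used, but the affine parameters would still be shared.
The standard approach is to create a new batchnorm layer for each place you use it.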
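To make the difference concrete, here is a minimal sketch comparing one reused batchnorm layer with two separate ones. The names shared_bn, x1 and x2 are made up for illustration, the random tensors merely stand in for the outputs of two conv layers, and the default track_running_stats=True is used for the first part so that there are running stats to look at.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# One layer reused twice vs. two separate layers (running stats tracked by default).
shared_bn = nn.BatchNorm2d(64)
bn1 = nn.BatchNorm2d(64)
bn2 = nn.BatchNorm2d(64)

# Stand-ins for the activations of two different conv layers (illustrative only).
x1 = torch.randn(8, 64, 16, 16)
x2 = torch.randn(8, 64, 16, 16) * 3.0 + 5.0

# Modules are in training mode by default, so every call updates the running stats.
shared_bn(x1)
shared_bn(x2)
bn1(x1)
bn2(x2)

# The reused layer has mixed both inputs into a single set of running stats.
shared_updates = int(shared_bn.num_batches_tracked)
separate_updates = (int(bn1.num_batches_tracked), int(bn2.num_batches_tracked))
print("updates seen by shared layer:", shared_updates)
print("updates seen by separate layers:", separate_updates)

# Separate layers keep statistics that match their own input.
print("bn1 running mean (avg):", bn1.running_mean.mean().item())
print("bn2 running mean (avg):", bn2.running_mean.mean().item())

# With track_running_stats=False the running buffers are not created at all,
# but the affine parameters (weight, bias) still exist and would be shared on reuse.
bn_no_stats = nn.BatchNorm2d(64, track_running_stats=False)
no_buffers = bn_no_stats.running_mean is None and bn_no_stats.running_var is None
print("no running buffers:", no_buffers)
print("affine weight shape:", tuple(bn_no_stats.weight.shape))
```

The reused layer ends up with a single weight, bias and set of running stats that has to fit both inputs, while bn1 and bn2 can each adapt to the activations they actually see.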

shrutishrestha · April 14, 2021, 7:09am

Ok. Thank you @ptrblck

shrutishrestha · April 14, 2021, 7:29am

@ptrblck is this the same for conv blocks too? Could we use one conv3x3 object for different layers, or should we use a separate conv3x3 object for each conv layer?

ptrblck · April 14, 2021, 10:43pm

The same applies to all layers. Usually you would create separate layers, each with its own trainable parameters. However, there are of course use cases where you explicitly want to reuse the parameters, in which case you could create a module once and call it several times in the forward pass.
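As a rough sketch of both options, the two small modules below differ only in whether the conv layer is created once and called twice, or created twice. The class names SharedConvNet and SeparateConvNet and the helper count_params are hypothetical and only serve this example.

```python
import torch
import torch.nn as nn


class SharedConvNet(nn.Module):
    """Hypothetical example: one conv module called twice (weights are shared)."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False)
        # Batchnorm layers are still kept separate, as is usually done.
        self.bn1 = nn.BatchNorm2d(64)
        self.bn2 = nn.BatchNorm2d(64)

    def forward(self, x):
        x = torch.relu(self.bn1(self.conv(x)))
        x = torch.relu(self.bn2(self.conv(x)))  # same conv weights again
        return x


class SeparateConvNet(nn.Module):
    """Hypothetical example: the usual setup with one conv module per layer."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False)
        self.conv2 = nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.bn2 = nn.BatchNorm2d(64)

    def forward(self, x):
        x = torch.relu(self.bn1(self.conv1(x)))
        x = torch.relu(self.bn2(self.conv2(x)))
        return x


def count_params(model):
    # Total number of trainable parameter elements in the model.
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


shared = SharedConvNet()
separate = SeparateConvNet()
print("shared params:", count_params(shared))
print("separate params:", count_params(separate))

x = torch.randn(2, 64, 8, 8)
out = shared(x)
# The gradient of the shared weight accumulates contributions from both calls.
out.sum().backward()
print("output shape:", tuple(out.shape))
print("shared conv grad shape:", tuple(shared.conv.weight.grad.shape))
```

Reusing the module ties the two layers together: there is one weight tensor, and its gradient is the sum of the contributions from both places it is used in the forward pass.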