Suppose we initialize a batchnorm layer:
Can we reuse this bn1 after every convolution layer that outputs 64 channels?
Or do we need to do:
i.e., create a separate layer for each of the two conv layers that output 64 channels each?
Does batch norm track state, i.e., the running mean and variance, across forward passes?
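To illustrate the question (the original code snippets were not preserved, so this is a reconstruction assuming `bn1 = nn.BatchNorm2d(64)`): batch norm does track state. A single forward pass in training mode updates `running_mean` and `running_var`, which is why reusing one layer in several places mixes statistics from all of them.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

bn1 = nn.BatchNorm2d(64)                 # matches a conv output of 64 channels
stats_before = bn1.running_mean.clone()  # running_mean is initialized to zeros

# One forward pass in training mode updates the running statistics
x = torch.randn(8, 64, 16, 16) + 5.0     # batch with a clearly non-zero mean
_ = bn1(x)

stats_after = bn1.running_mean.clone()
print(stats_before[:3])  # tensor([0., 0., 0.])
print(stats_after[:3])   # shifted toward the batch mean (default momentum is 0.1)
```

Every module that calls `bn1` would keep nudging these same buffers, so the stored statistics would reflect a mixture of all the feature maps passed through it.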
You could reuse the same batchnorm layer, but it would then share the same affine parameters and update its running stats with every input, which is most likely not what you want.
The standard approach would be to create a new batchnorm layer.
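A minimal sketch of that standard approach (the module and layer names here are illustrative, not from the original thread): each conv gets its own `BatchNorm2d`, so each normalization layer learns its own affine parameters and tracks its own running statistics.

```python
import torch
import torch.nn as nn

class SmallNet(nn.Module):
    """Two 64-channel convs, each followed by its own BatchNorm2d."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(64)   # stats/affine params for conv1's output
        self.conv2 = nn.Conv2d(64, 64, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(64)   # separate stats/affine params for conv2
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.relu(self.bn1(self.conv1(x)))
        x = self.relu(self.bn2(self.conv2(x)))
        return x

model = SmallNet()
out = model(torch.randn(2, 3, 32, 32))
print(out.shape)  # torch.Size([2, 64, 32, 32])
```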
@ptrblck is this the same for conv blocks too? Could we use one conv3x3 object for different layers, or should we use a separate conv3x3 layer at each position?
The same applies to all layers. Usually you would recreate them and use separate layers, each with its own trainable parameters. However, there are of course use cases where you explicitly want to share the parameters, in which case you could create a single module and call it several times in the forward pass.
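A minimal sketch of that weight-sharing pattern (module name is illustrative): one conv module applied twice in `forward`, so both applications use the exact same weights and gradients accumulate into the same parameters.

```python
import torch
import torch.nn as nn

class SharedConvNet(nn.Module):
    """Applies one conv3x3 module twice, sharing its parameters."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(64, 64, kernel_size=3, padding=1)

    def forward(self, x):
        x = torch.relu(self.conv(x))  # first application
        x = torch.relu(self.conv(x))  # second application, same weights
        return x

model = SharedConvNet()
out = model(torch.randn(2, 64, 8, 8))

# Only one weight/bias pair is registered, even though conv is applied twice
n_params = sum(p.numel() for p in model.parameters())
print(n_params)  # 64*64*3*3 + 64 = 36928
```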