I have a network, a pretrained ResNet50 model, that consists of batch normalization (BN) layers and other layers (convolution, FC, dropout, etc.).
I don't want the model to be trained, so I froze all the layers with requires_grad=False,
but I found that the BN layers were still updating and the performance gradually dropped.
So I used the code below to freeze the batch norm layers:
import torch.nn as nn

for module in model.modules():
    # print(module)
    if isinstance(module, nn.BatchNorm2d):
        # freeze the affine parameters (weight is gamma, bias is beta);
        # they are None if the layer was created with affine=False
        if module.weight is not None:
            module.weight.requires_grad_(False)
        if module.bias is not None:
            module.bias.requires_grad_(False)
        # stop tracking (updating) the running statistics
        module.track_running_stats = False
        # module.eval()
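For reference, this is roughly how I checked whether the running statistics were still moving; a minimal sketch, assuming model is the ResNet50 above and a standard 224x224 input:

import torch

# snapshot the running mean of the first BN layer
bn = next(m for m in model.modules() if isinstance(m, nn.BatchNorm2d))
before = bn.running_mean.clone()

# one dummy forward pass in training mode (buffer updates happen inside
# forward, so no_grad does not prevent them)
model.train()
with torch.no_grad():
    model(torch.randn(4, 3, 224, 224))

# prints False if the forward pass updated the running statistics
print(torch.equal(before, bn.running_mean))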
But I am confused about the difference between calling module.eval() and setting module.track_running_stats = False.
If I want the BN layers not to be updated and the inference performance not to change, is it enough to set module.weight.requires_grad_(False)? Or should I also turn off track_running_stats,
which controls the running statistics used in inference mode?
I don't know what kind of error you are getting, but changing the track_running_stats attribute after the layer has been created might be dangerous. It works in my setup, but I don't think there is any guarantee that changing this attribute will keep working in previous/future versions.
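To make the difference concrete, here is a minimal standalone sketch of what the two settings do. This matches the behavior I see in recent releases, but as said above, the internals are not guaranteed to stay this way:

import torch
import torch.nn as nn

bn = nn.BatchNorm2d(3)
x = torch.randn(8, 3, 4, 4)

# 1) train mode (default): normalizes with batch stats, updates running stats
before = bn.running_mean.clone()
bn(x)
print(torch.equal(before, bn.running_mean))  # False: running stats updated

# 2) eval mode: normalizes with the stored running stats and does not update them
bn.eval()
before = bn.running_mean.clone()
bn(x)
print(torch.equal(before, bn.running_mean))  # True: untouched

# 3) train mode with track_running_stats flipped to False after creation:
#    the running stats are no longer updated, but the layer now normalizes
#    with the current batch statistics, so its output no longer matches eval mode
bn.train()
bn.track_running_stats = False
before = bn.running_mean.clone()
bn(x)
print(torch.equal(before, bn.running_mean))  # True: frozen, batch stats used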
Putting module.eval() works and does not change the performance.
But the BN layers kept changing running_mean and running_var during training even though I had frozen them with module.eval().
So I thought I should also turn track_running_stats off to prevent updates to the running_mean and running_var that are used in inference.
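One thing to check (I'm assuming a standard training loop here, since it isn't shown): calling model.train() on the whole model, e.g. at the start of each epoch, puts every submodule back into training mode and silently undoes module.eval() on the BN layers. A common workaround is to re-apply eval mode to the BN layers right after every model.train() call, for example with a small helper like this (set_bn_eval is just an illustrative name):

import torch.nn as nn

def set_bn_eval(model: nn.Module) -> None:
    # put every BN layer into eval mode so running_mean/running_var stay frozen
    for module in model.modules():
        if isinstance(module, nn.BatchNorm2d):
            module.eval()

# inside the training loop:
model.train()       # training mode for the rest of the model (dropout etc.)
set_bn_eval(model)  # but keep the BN layers frozen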