Why set batchnorm in eval mode during training?

Here is part of the Mask R-CNN code I found on GitHub. In the predict function, the batchnorm layers are set to eval mode during training, but why?
As far as I know, batchnorm in eval mode does not update the running mean and running variance. In this implementation batchnorm is always in eval mode, during both training and testing, so does that mean the running mean and running variance always stay at their initial values? What are those initial values, and what about gamma and beta?

def predict(self, input, mode):
    molded_images = input[0]
    image_metas = input[1]

    if mode == 'inference':
        self.eval()
    elif mode == 'training':
        self.train()

        # Set batchnorm always in eval mode during training
        def set_bn_eval(m):
            classname = m.__class__.__name__
            if classname.find('BatchNorm') != -1:
                m.eval()

        self.apply(set_bn_eval)
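
To illustrate what I mean, here is a small check with a plain torch.nn.BatchNorm2d (just my own sketch of the default behaviour, not the Mask R-CNN layer itself):

import torch
import torch.nn as nn

bn = nn.BatchNorm2d(3)
print(bn.running_mean)  # tensor([0., 0., 0.]) -> running mean starts at 0
print(bn.running_var)   # tensor([1., 1., 1.]) -> running var starts at 1
print(bn.weight.data)   # gamma, initialized to 1
print(bn.bias.data)     # beta, initialized to 0

bn.eval()
_ = bn(torch.randn(4, 3, 8, 8))
print(bn.running_mean)  # still all zeros: eval mode never updates the running stats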

Huh, cool find. I’m also curious. Link to code.

Edit: OK, so I’m still not sure, but one reason to freeze batch norm is when the batch size is very low, e.g. 1. I tried Mask R-CNN at some point and remember it being very GPU-memory intensive. I’m guessing they load pretrained weights for their backbone, and those batchnorm statistics are already tuned to another dataset, so keeping them frozen is better than no batch norm at all? Again, speculation.
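
Something like this is what I imagine is going on (just a sketch using a torchvision ResNet-50 as a stand-in for their backbone, not the actual Mask R-CNN code):

import torch
import torch.nn as nn
import torchvision

# Hypothetical fine-tuning setup: a pretrained backbone whose BN statistics
# we want to keep, because the batch size is too small for reliable estimates.
model = torchvision.models.resnet50(pretrained=True)

def set_bn_eval(m):
    # Same trick as in the Mask R-CNN code: catch BatchNorm layers by class name.
    if 'BatchNorm' in m.__class__.__name__:
        m.eval()

model.train()             # everything else behaves as in training
model.apply(set_bn_eval)  # but BN uses the frozen pretrained running stats

x = torch.randn(1, 3, 224, 224)  # batch size 1: batch stats would be very noisy
out = model(x)
out.sum().backward()

# Running stats are untouched, but gamma/beta still received gradients:
bn = model.bn1
print(bn.running_mean[:3])         # unchanged pretrained values
print(bn.weight.grad is not None)  # True -- gamma/beta still train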
