Finetuning pretrained networks with batchnorm

I wish to finetune pretrained networks with batchnorm layers for fully convolutional networks (FCNs). Because of GPU memory limits, we often use batchsize=1 when training an FCN, so I want to freeze all the batchnorm parameters (the BN weight and bias as well as the running mean and variance). I set requires_grad to False for all the BN layers.
However, when I start training the FCN, the running mean and variance still change after one iteration. What should I do to freeze the running mean and variance during finetuning?
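For reference, this is roughly what I am doing (a minimal sketch; the tiny Sequential model is just a stand-in for my pretrained FCN backbone):

```python
import torch.nn as nn

# toy stand-in for the pretrained FCN backbone (assumption for this sketch)
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
)

# what I described above: turn off gradients for every BN weight and bias
for module in model.modules():
    if isinstance(module, nn.BatchNorm2d):
        for param in module.parameters():   # BN weight (gamma) and bias (beta)
            param.requires_grad = False
```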

You need to set those layers to eval mode with layer.eval(), where layer is a BatchNorm layer.

Eval mode makes sure that BN uses the stored running_mean and running_var instead of the batch statistics.
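Something along these lines (a minimal sketch with a toy model; the helper name set_bn_eval is just for illustration):

```python
import torch
import torch.nn as nn

# toy model standing in for the real FCN (assumption for this sketch)
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)

def set_bn_eval(module):
    # eval mode: use stored running_mean/running_var and stop updating them
    if isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
        module.eval()

model.train()             # the rest of the network stays in train mode
model.apply(set_bn_eval)  # every BN layer is switched to eval mode

# sanity check: running stats no longer change during forward passes
before = model[1].running_mean.clone()
model(torch.randn(1, 3, 32, 32))
assert torch.equal(before, model[1].running_mean)
```

Keep in mind that calling model.train() again flips the BN layers back to training mode, so you have to re-apply this after every model.train() call (e.g. at the start of each epoch).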


What if I just want to freeze the running var and mean, but still learn the scale and shift (weight and bias) of the BN layers? This is common when training object detection with a ResNet pretrained on ImageNet. How can I do this?
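Concretely, is something like this (combining the .eval() suggestion above with leaving requires_grad untouched) the right way to do it? A sketch, assuming the torchvision resnet50(pretrained=True) API:

```python
import torch.nn as nn
from torchvision.models import resnet50

# assumption: a torchvision ResNet-50 pretrained on ImageNet as the backbone
backbone = resnet50(pretrained=True)
backbone.train()

for module in backbone.modules():
    if isinstance(module, nn.BatchNorm2d):
        module.eval()   # freeze running_mean / running_var
        # weight (scale) and bias (shift) still have requires_grad=True,
        # so the optimizer keeps updating them
```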

Did you manage to solve this?
I’m trying to do the same thing but setting the batch-norm layers of my network to .eval() during training just gives me NaN loss.