Hello, everyone. I use deeplab-v2-resnet model for image segmentation. But due to the small batch size when training, I want to ‘freeze’ the parameters of BN layers which are loaded from pretrained model. I implement ‘frozen’ BN as follows:
When training, I set momentum = 0 for all nn.BatchNorm2d, so I think the running mean and running var will keep still. Then I set requires_grad of parameters() of nn.BatchNorm2d false. so I think weight (gamma) and bias (beta) will keep still. I also add the following codes to further check the correctness, I save the parameters of BN layers in first step to critemp, then I check whether the parameters of BN layers of each step temp are unchanged.
temp = []
critemp = torch.load("bn_para.pt")
def frozen_fn(m):
classname = m.__class__.__name__
if classname.find('BatchNorm2d') != -1:
temp.append([np.average(m.running_mean.data.cpu().numpy()),
np.average(m.running_var.data.cpu().numpy()),
np.average(m.weight.data.cpu().numpy()),
np.average(m.bias.data.cpu().numpy())])
model.apply(frozen_fn)
assert temp == critemp
And when testing, I directly use model.eval() and I also ensure that the parameters of BN layers are the same as those when training. But the results are quite terrible. And I change the mode from eval to train. And the results turn out to be much better. However, I think that both parameters of train and eval should be the totally same (I don’t use drop out). But why I still get the different performance??
Can anyone give me a help? Thanks!