Transfer running_mean/std of a trained model to another model

Yilin_Liu · February 17, 2019, 9:00pm

Hi, I want to substitute the running_mean/variance of a new model with the running_mean/variance of a pre-trained model. Is there a simple way to do that? Thanks!

ptrblck · February 18, 2019, 1:08am

You could just assign the running estimates to your new BatchNorm layers:

import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 3, 1, 1)
        self.bn1 = nn.BatchNorm2d(6)
        
    def forward(self, x):
        x = self.bn1(self.conv1(x))
        return x

modelA = MyModel()
# Update BN running estimates
for _ in range(10):
    modelA(torch.randn(10, 3, 24, 24))

print(modelA.bn1.running_mean)
print(modelA.bn1.running_var)

modelB = MyModel()

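# Copy the values in place; plain assignment would make both models share the same tensors
for childA, childB in zip(modelA.children(), modelB.children()):
    if isinstance(childA, nn.BatchNorm2d):
        childB.running_mean.copy_(childA.running_mean)
        childB.running_var.copy_(childA.running_var)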

print(modelB.bn1.running_mean)
print(modelB.bn1.running_var)
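
If both models use the same module names, another option might be to go through the state_dict and load only the BatchNorm statistics. This is just a minimal sketch: `bn_stats_only` and `make_model` are hypothetical helpers made up for illustration, and the suffix filter assumes that no other layer registers buffers with these names.

```python
import torch
import torch.nn as nn

BN_STAT_SUFFIXES = ("running_mean", "running_var", "num_batches_tracked")

def bn_stats_only(state_dict):
    """Keep only the BatchNorm running-statistics entries of a state_dict."""
    return {k: v for k, v in state_dict.items() if k.endswith(BN_STAT_SUFFIXES)}

def make_model():
    # Stand-in architecture, used only for this illustration
    return nn.Sequential(nn.Conv2d(3, 6, 3, 1, 1), nn.BatchNorm2d(6))

pretrained = make_model()
for _ in range(10):  # forward passes in train mode update the running estimates
    pretrained(torch.randn(10, 3, 24, 24))

new_model = make_model()
# strict=False: keys missing from the filtered dict (weights, biases) are left untouched
new_model.load_state_dict(bn_stats_only(pretrained.state_dict()), strict=False)

print(torch.equal(new_model[1].running_mean, pretrained[1].running_mean))
print(torch.equal(new_model[1].running_var, pretrained[1].running_var))
```

load_state_dict copies the values into the existing buffers, so the two models do not end up sharing tensors, and num_batches_tracked is carried over as well.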

Depending on the architectures, the assignment might be a bit trickier, and you might want to select the appropriate layers manually, e.g.:

modelB.classifier.bn1.running_mean = modelA.features.bn17.running_mean
...
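
If there are many such pairs, the manual selection could be collected in a small name mapping. The following is only a sketch under some assumptions: `transfer_bn_stats`, `Source`, and `Target` are hypothetical names invented for this example, the layer names are the ones reported by named_modules(), and all BatchNorm layers are assumed to use the default track_running_stats=True.

```python
import torch
import torch.nn as nn

BN_TYPES = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)

def transfer_bn_stats(src, dst, name_map):
    """Copy running statistics between BatchNorm layers paired by module name.

    name_map: {layer name in dst: layer name in src}, as reported by named_modules().
    Returns the list of dst layer names that were updated.
    """
    src_modules = dict(src.named_modules())
    dst_modules = dict(dst.named_modules())
    updated = []
    with torch.no_grad():
        for dst_name, src_name in name_map.items():
            src_bn, dst_bn = src_modules[src_name], dst_modules[dst_name]
            if not (isinstance(src_bn, BN_TYPES) and isinstance(dst_bn, BN_TYPES)):
                raise TypeError(f"{src_name} -> {dst_name} is not a BatchNorm pair")
            # copy_ writes into the existing buffers, so nothing is shared afterwards
            dst_bn.running_mean.copy_(src_bn.running_mean)
            dst_bn.running_var.copy_(src_bn.running_var)
            dst_bn.num_batches_tracked.copy_(src_bn.num_batches_tracked)
            updated.append(dst_name)
    return updated

# Toy models whose BatchNorm layers live under different names
class Source(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(3, 6, 3, 1, 1), nn.BatchNorm2d(6))

    def forward(self, x):
        return self.features(x)

class Target(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 6, 3, 1, 1)
        self.bn1 = nn.BatchNorm2d(6)

    def forward(self, x):
        return self.bn1(self.conv(x))

source, target = Source(), Target()
for _ in range(5):  # update the source running estimates
    source(torch.randn(8, 3, 16, 16))

updated = transfer_bn_stats(source, target, {"bn1": "features.1"})
print(updated)
```

A shape mismatch between paired layers (different num_features) would raise an error in copy_, which is probably what you want here instead of a silent partial transfer.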