I am performing transfer learning on a pre-trained version of MobileNet V2. I want to freeze all
layers of the model except for BatchNorm2d modules. I am currently handling this with the following loop:
```python
for module in model.modules():
    if not isinstance(module, nn.modules.BatchNorm2d):
        module.requires_grad_(False)
    else:
        module.requires_grad_(True)
```
This iterates through the modules and turns off gradient computation for every layer except the BatchNorm2d layers. However, MobileNet V2 contains several custom modules. For example, the ConvBNReLU block groups a Conv2d, a BatchNorm2d, and a ReLU6. With this approach, gradient computation is first turned off for the ConvBNReLU module as a whole, but when the iteration reaches its children, gradient computation is turned back on for the BatchNorm2d layer.

My questions are: is this the correct way to freeze non-BatchNorm2d layers inside custom modules, and will this method prevent BatchNorm2d layers within ConvBNReLU modules from being updated during training?
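For reference, here is a minimal, self-contained sketch of the situation. The `ConvBNReLU` class below is a hypothetical stand-in for the torchvision block (same layer grouping, simplified constructor), so the resulting `requires_grad` flags can be inspected without loading the full MobileNet V2:

```python
import torch.nn as nn

# Hypothetical stand-in for torchvision's ConvBNReLU block:
# a Conv2d, BatchNorm2d, and ReLU6 grouped in one module.
class ConvBNReLU(nn.Sequential):
    def __init__(self, in_ch, out_ch):
        super().__init__(
            nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU6(inplace=True),
        )

model = nn.Sequential(ConvBNReLU(3, 16), ConvBNReLU(16, 32))

# The freezing loop from above. model.modules() is a pre-order traversal,
# so each parent block is visited (and frozen recursively) before its
# children, and the BatchNorm2d children are re-enabled afterwards.
for module in model.modules():
    if not isinstance(module, nn.BatchNorm2d):
        module.requires_grad_(False)
    else:
        module.requires_grad_(True)

# Inspect the end state of every parameter.
for name, p in model.named_parameters():
    print(name, p.requires_grad)
```

Running this shows which parameters are left trainable after the loop finishes, which is the end state the optimizer will actually see.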