Random initialisation of DenseNet layers

What is the best way of randomly initialising, but then freezing, the last layers of a DenseNet?

I have the following code, where I am using a pretrained model but unfreezing the last denseblock4 and norm5 blocks for fine-tuning:

from torchvision import models

model = models.densenet161(pretrained=True)
# freeze all parameters first
for param in model.parameters():
    param.requires_grad = False
# unfreeze the last two feature blocks (denseblock4 and norm5)
submodules = model.features[-2:]
for param in submodules.parameters():
    param.requires_grad = True

However, instead of training these submodule layers, I would like to randomly initialise them and then freeze them. What would be the best way of doing this?

You could remove the second loop, which “unfreezes” the parameters, and call .reset_parameters() on the desired modules instead:

for module in submodules.modules():
    if hasattr(module, "reset_parameters"):
        module.reset_parameters()

or use a custom weight initialization via submodules.apply.
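
For example, a rough sketch of the apply approach; the init_weights function below is only illustrative and mirrors the conv initialisation used later in this thread:

import torch

def init_weights(module):
    # illustrative custom init: Kaiming for conv weights, zeros for biases
    if isinstance(module, torch.nn.Conv2d):
        torch.nn.init.kaiming_normal_(module.weight)
        if module.bias is not None:
            torch.nn.init.zeros_(module.bias)

# apply recursively to every submodule of the selected blocks
submodules.apply(init_weights)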

Ok great, I think that’s exactly what I was looking for. Is there a way of checking whether they are frozen or unfrozen?

Yes, you can check the .requires_grad attribute of the corresponding parameters and verify it’s set to False:

for name, param in submodules.named_parameters():
    print("{}.requires_grad == {}".format(name, param.requires_grad))

Let’s say I want to keep the batch normalisation stats from the ImageNet pretraining and so not randomly initialise them. Would the following be appropriate when defining the model, before running the training loop?

with torch.no_grad():  # re-initialize the parameters without tracking gradients
    submodules = model.features[-2:] 
    for submodule in submodules.modules():
        if isinstance(submodule, torch.nn.Conv2d):
            # randomly re-initialize the weights
            torch.nn.init.kaiming_normal_(submodule.weight)
            if submodule.bias is not None:
                # reset the bias to zero
                torch.nn.init.zeros_(submodule.bias)
        elif isinstance(submodule, torch.nn.BatchNorm2d):
            torch.nn.init.ones_(submodule.weight)
            torch.nn.init.zeros_(submodule.bias)
            # also reset the running_mean and running_var buffers
            torch.nn.init.zeros_(submodule.running_mean)
            torch.nn.init.ones_(submodule.running_var)

No, since you are resetting the batchnorm parameters and buffers to their default initial values, while it seems you want to keep the pretrained values. Or am I misunderstanding your use case?

Well, I’m trying to randomly initialise (or set) the last few conv layers whilst keeping them frozen. However, I am also considering keeping the batch norm layers frozen with their previous ImageNet values rather than randomly re-initialising them as well. I’m just curious to see if this would have an impact during training.

Also, this has made me think: when resetting the parameters, how would I keep them frozen? I’m not sure if the above solution does that :thinking:.

You are resetting the batchnorm parameters and buffers to their initial values in that snippet and are not keeping the pretrained values.

Set their .requires_grad attribute to False as already explained.
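
For that use case, a minimal sketch could skip the batchnorm branch entirely, so those layers keep their pretrained parameters and running stats (assuming model is the already-frozen densenet161 from above; the Kaiming init just mirrors your earlier snippet):

import torch

with torch.no_grad():
    submodules = model.features[-2:]
    for submodule in submodules.modules():
        # only re-initialize the conv layers; the batchnorm layers keep
        # their pretrained weight/bias and running stats
        if isinstance(submodule, torch.nn.Conv2d):
            torch.nn.init.kaiming_normal_(submodule.weight)
            if submodule.bias is not None:
                torch.nn.init.zeros_(submodule.bias)

# the parameters were changed inplace, so they are still frozen
for name, param in submodules.named_parameters():
    print("{}.requires_grad == {}".format(name, param.requires_grad))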

Ah ok, so would this be sufficient to keep the later, modified layers “frozen” at a random initialisation rather than at the ImageNet values?

for param in model.parameters():
    param.requires_grad = False

submodules = model.features[-2:]
for module in submodules.modules():
    if hasattr(module, "reset_parameters"):
        module.reset_parameters()

num_ftrs = model.classifier.in_features
model.classifier = torch.nn.Linear(num_ftrs, 2)

No; the newly created model.classifier is trainable by default, so you would need to freeze it after replacing it with the new linear layer:

for param in model.classifier.parameters():
    param.requires_grad = False

Ok, apologies, I don’t think I made myself clear here.

I want to create a feature extractor and so only train the classifier layers, and freeze the preceding feature extractor layers. However, I would like to randomly initialise and freeze the layers in the last two blocks of the feature extractor and only keep ImageNet values for the layers before these blocks.

I’m just wondering whether adding this code after freezing all layers unfreezes the last two blocks:

for module in submodules.modules():
    if hasattr(module, "reset_parameters"):
        module.reset_parameters()

I’m assuming that despite resetting parameters, they are still frozen apart from the classifier.

Calling reset_parameters() will manipulate the parameters in place and will thus not change any attributes. If you’ve frozen these parameters before, they should still be frozen.
However, you can use my earlier code snippet to easily verify it by printing the .requires_grad attribute afterwards.
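
For completeness, a minimal sketch combining the snippets from this thread (freeze everything, reset the last two blocks, replace the classifier so only it is trainable), with the check at the end:

import torch
from torchvision import models

model = models.densenet161(pretrained=True)

# freeze the whole pretrained network
for param in model.parameters():
    param.requires_grad = False

# randomly re-initialize the last two feature blocks; they stay frozen
submodules = model.features[-2:]
for module in submodules.modules():
    if hasattr(module, "reset_parameters"):
        module.reset_parameters()

# the new classifier is trainable by default
num_ftrs = model.classifier.in_features
model.classifier = torch.nn.Linear(num_ftrs, 2)

# verify: only the classifier parameters should require gradients
for name, param in model.named_parameters():
    print("{}.requires_grad == {}".format(name, param.requires_grad))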

Great, thank you for being patient with me!