How to freeze some layers of a fully connected network?

Hi,
How can I freeze the initial layers of the network, except for the last two layers? Do I need to implement a filter in the optimizer for those frozen layers? If yes, how?

import torch
import torch.nn as nn
import torch.optim as optim

# input_n (input size), h_nk (hidden size) and learning_rate are hyperparameters
# defined elsewhere.

class Swish(nn.Module):
    # Custom activation; assumed definition x * sigmoid(x), the actual module
    # is defined elsewhere.
    def forward(self, x):
        return x * torch.sigmoid(x)

class Net_k(nn.Module):

    # The __init__ function stacks the layers of the
    # network sequentially
    def __init__(self):
        super(Net_k, self).__init__()
        self.main = nn.Sequential(
            nn.Linear(input_n, h_nk),
            Swish(),
            nn.Linear(h_nk, h_nk),
            Swish(),
            nn.Linear(h_nk, h_nk),
            Swish(),
            nn.Linear(h_nk, h_nk),
            Swish(),
            nn.Linear(h_nk, h_nk),
            Swish(),
            nn.Linear(h_nk, h_nk),
            Swish(),
            nn.Linear(h_nk, h_nk),
            Swish(),
            nn.Linear(h_nk, h_nk),
            Swish(),
            nn.Linear(h_nk, h_nk),
            Swish(),
            nn.Linear(h_nk, h_nk),
            Swish(),
            nn.Linear(h_nk, 1),
            Swish(),
        )

    def forward(self, x):
        output = self.main(x)
        return output

net_k = Net_k()
optimizer_k = optim.Adam(net_k.parameters(), lr=learning_rate, betas=(0.9, 0.99), eps=1e-15)

You can set the .requires_grad attribute of the parameters of the first layers to False to freeze them.

You don’t necessarily need to filter out the frozen parameters, as they won’t be updated, but you could do it via e.g. torch.optim.SGD(filter(lambda p: p.requires_grad, model.parameters()), lr=...).
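For example, the filtering could look like this minimal sketch, assuming the net_k, learning_rate, and requires_grad flags from above:

# pass only the parameters that are still trainable to the optimizer
trainable_params = [p for p in net_k.parameters() if p.requires_grad]
optimizer_k = optim.Adam(trainable_params, lr=learning_rate, betas=(0.9, 0.99), eps=1e-15)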

Thanks ptrblck.
Can you please provide me an example of how I can use requires_grad to freeze a layer?

Sure, this code snippet freezes the parameters of the first layer:

import torchvision.models as models

model = models.resnet18()
# freeze the parameters of the first conv layer
for param in model.conv1.parameters():
    param.requires_grad = False
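To check which parameters are frozen afterwards, you could print the flags:

for name, param in model.named_parameters():
    print(name, param.requires_grad)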

If I want to freeze all layers except the last one, is it correct to write:

for parameter in Net_k.parameters():
    parameter.requires_grad = False
for parameter in Net_k[-1].parameters():
    parameter.requires_grad = True

It would be correct, if Net_k[-1] returns the last layer, e.g. if it’s defined as an nn.Sequential module. However, if you are using a custom nn.Module, you would need to access the last layer directly via model.last_layer_name.parameters().
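In your case the layers are stored in the .main nn.Sequential attribute of Net_k, so the indexing would go through that attribute. A minimal sketch under that assumption (note that main[-1] is the final Swish(), which has no parameters, so the last layer with parameters is main[-2]):

net_k = Net_k()

# freeze all parameters
for param in net_k.parameters():
    param.requires_grad = False

# unfreeze the last nn.Linear layer; main[-1] is the parameter-free Swish()
for param in net_k.main[-2].parameters():
    param.requires_grad = True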

The network I defined uses an nn.Sequential module, as shown above, so I think this is correct.