Loading & Freezing a Pretrained Model to Combine with a New Network

I have a pretrained model and would like to build a classifier on top of it. I’m trying to load and freeze the weights of the pretrained model and pass its outputs to the new classifier, which I’d like to optimise. Here is what I have so far; I’m a little stuck on a TypeError: forward() missing 1 required positional argument: 'x' error raised by the nn.Sequential line:

import torch.nn as nn

import model  # model.py contains the architecture of the pretrained model

class Classifier(nn.Module):
    def __init__(self):
        ...
    def forward(self, x):
        ...

net = model.Model()
net.load_state_dict(checkpoint["net"])  # checkpoint loaded earlier, e.g. checkpoint = torch.load(...)

for child in net.children():
    for param in child.parameters():
        param.requires_grad = False

model = nn.Sequential(nn.ModuleList(net()), Classifier())

I assume the error is thrown in nn.ModuleList(net()), as you are trying to call net() without any inputs.
Note that net(input) will return the output of the forward pass, so wrapping it in an nn.ModuleList won’t work.
Try to pass both models directly to nn.Sequential.
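For example (a minimal sketch reusing net and Classifier from your snippet; the dummy input shape is just an assumption for illustration):

import torch
import torch.nn as nn

combined = nn.Sequential(net, Classifier())  # pass the modules themselves, don't call them
out = combined(torch.randn(1, 3, 224, 224))  # net's output is fed straight into the classifier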


I’m unfamiliar with how weights are loaded into a model in PyTorch behind the scenes. Could I still load and freeze the state_dict of the model without instantiating it (i.e. directly passing the class)?

You would need to create an instance of your class, which will create the parameters, which you could freeze thereafter.
The state_dict maps the parameter (and buffer) names of the model to tensors, which do not require gradients by default and only store the values.
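As a minimal sketch of that order of operations (assuming the checkpoint layout from your snippet):

net = model.Model()                     # instantiating creates the parameters
net.load_state_dict(checkpoint["net"])  # copies the stored tensors into them
for param in net.parameters():          # .parameters() iterates all submodules recursively,
    param.requires_grad = False         # so the nested loop over children isn't needed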

I think I’m misunderstanding something here… If I do the following, will I still be able to use the pretrained parameters of model.Model()?

net = model.Model()
net.load_state_dict(checkpoint["net"])
for param in net.parameters():  # freeze as above
    param.requires_grad = False

model = nn.Sequential(model.Model(), Classifier())

Ah I should be doing nn.Sequential(net, Classifier()) instead… Thanks for all your help!
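For completeness, an end-to-end sketch of the working version; the checkpoint path and optimizer settings below are placeholders, not part of the original code:

import torch
import torch.nn as nn
import torch.optim as optim

import model  # model.py contains the architecture of the pretrained model

checkpoint = torch.load("checkpoint.pth")  # placeholder path
net = model.Model()
net.load_state_dict(checkpoint["net"])
for param in net.parameters():
    param.requires_grad = False  # freeze the pretrained backbone

combined = nn.Sequential(net, Classifier())
# Optimise only the classifier by filtering out the frozen parameters:
optimizer = optim.Adam(
    (p for p in combined.parameters() if p.requires_grad), lr=1e-3
)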