How to partially load a model for transfer learning

coyote · August 14, 2018, 12:40pm

Hi everyone,

I have a (probably basic) question.

I have found a PyTorch model on GitHub that I want to use. Link. The model implemented in this repository is shown in the following figure:

I first train the model on the dataset given in the repository, and then I want to fine-tune it on my own dataset, which has a different number of classes. Therefore, I want to load the weights of all layers except the last fully connected layer. Is there a way to do that? If I wanted to load the complete model, it would be done as follows:


# Load the checkpoint and restore all weights, including fc3
checkpoint = torch.load(args.model_path)
model.load_state_dict(checkpoint['state_dict'])

The model architecture is as follows:


def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.conv3(x)
        x = self.conv4(x)
        x = self.conv5(x)
        x = self.conv6(x)

        # collapse
        x = x.view(x.size(0), -1)
        # linear layer
        x = self.fc1(x)
        # linear layer
        x = self.fc2(x)
        # linear layer
        x = self.fc3(x)
        # output layer
        x = self.log_softmax(x)
        
        return x

I just want to skip self.fc3. In the originally trained model it is self.fc3 = nn.Linear(1024, 4), but in my model it will be self.fc3 = nn.Linear(1024, 6), so I want that layer to be randomly initialized.
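
To make the problem concrete, here is a minimal sketch of what I expect to happen if the full state dict is loaded into the 6-class model. TinyNet is a hypothetical stand-in for the real network (only two linear layers, with a made-up input size of 8); the only parts taken from my setup are the fc3 shapes.

```python
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the real model; only the fc3 shapes match it."""

    def __init__(self, num_classes):
        super().__init__()
        self.fc2 = nn.Linear(8, 1024)            # input size 8 is an assumption
        self.fc3 = nn.Linear(1024, num_classes)

    def forward(self, x):
        return self.fc3(self.fc2(x))


old_model = TinyNet(num_classes=4)  # plays the role of the trained checkpoint
new_model = TinyNet(num_classes=6)  # the model to be fine-tuned

try:
    # fc3.weight is (4, 1024) in the checkpoint but (6, 1024) in new_model
    new_model.load_state_dict(old_model.state_dict())
    mismatch_error = None
except RuntimeError as err:
    mismatch_error = str(err)

print(mismatch_error)
```

So a plain load_state_dict call should fail with a size mismatch on the fc3 entries, which is why those two tensors have to be left out somehow.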

Is it possible to do that? If so, could you help me?

EDIT

Actually, I know that I can do the following to access the individual layers’ weights:

>>> checkpoint = torch.load(args.model_path)
>>> checkpoint['state_dict'].keys()
odict_keys(['conv1.0.weight', 'conv1.0.bias', 'conv2.0.weight', 'conv2.0.bias', 'conv3.0.weight', 'conv3.0.bias', 'conv4.0.weight', 'conv4.0.bias', 'conv5.0.weight', 'conv5.0.bias', 'conv6.0.weight', 'conv6.0.bias', 'fc1.0.weight', 'fc1.0.bias', 'fc2.0.weight', 'fc2.0.bias', 'fc3.weight', 'fc3.bias'])
>>> newmodel = charCNN(args)  # How to use odict_keys to initialize weights of newmodel with checkpoint?

However, I do not know how to exclude fc3.weight and fc3.bias and automatically assign the remaining weights to the corresponding parameters of the new model.
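
One approach that might work is sketched below; it has not been checked against the repository's charCNN. The idea is to drop every key that starts with 'fc3.' from the checkpoint's state dict and load the rest with strict=False, so the missing fc3 entries are tolerated and fc3 keeps its random initialization. TinyNet and the in-memory checkpoint dict are hypothetical stand-ins for charCNN(args) and torch.load(args.model_path).

```python
from collections import OrderedDict

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for charCNN, keeping only fc2 and fc3."""

    def __init__(self, num_classes):
        super().__init__()
        # Wrapped in Sequential so the keys look like 'fc2.0.weight'
        self.fc2 = nn.Sequential(nn.Linear(8, 1024), nn.ReLU())
        self.fc3 = nn.Linear(1024, num_classes)

    def forward(self, x):
        return self.fc3(self.fc2(x))


# Stand-in for checkpoint = torch.load(args.model_path)
pretrained = TinyNet(num_classes=4)
checkpoint = {'state_dict': pretrained.state_dict()}

newmodel = TinyNet(num_classes=6)
fc3_before = newmodel.fc3.weight.detach().clone()

# Keep every entry except the ones that belong to fc3
filtered = OrderedDict(
    (key, value)
    for key, value in checkpoint['state_dict'].items()
    if not key.startswith('fc3.')
)

# strict=False lets load_state_dict accept a state dict with missing keys
newmodel.load_state_dict(filtered, strict=False)

# Parameters that were not loaded and therefore keep their random init
missing = [key for key in newmodel.state_dict() if key not in filtered]
print(missing)
```

With the real model, filtered would be built from the loaded checkpoint in the same way. Filtering by key prefix assumes that only fc3 differs between the two models; any other layer with a changed shape would need to be excluded as well.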