Transfer learning: copying weights from one model to another

I have a simple problem: I want to transfer the weights of one model to another model that differs only in its final layers (an MLP head, for example).

I am loading the pretrained model with `load_state_dict`, but it tries to copy ALL the weights of the pretrained model. How do I copy only the weights I want, i.e. the ones that also exist in the new model (same layer name)?

You could adapt this code snippet and set your condition to the layer names you would like to copy.
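Roughly along these lines (just a sketch; the `NewModel` class, the `backbone`/`final_mlp` names, and `init_weights()` are placeholders you would adapt to your own model):

import torch
import torch.nn as nn

class NewModel(nn.Module):
    # hypothetical model: a backbone shared with the pretrained model plus a new final MLP
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(128, 64)
        self.final_mlp = nn.Sequential(nn.Linear(64, 32), nn.Linear(32, 10))

def init_weights(m):
    # Xavier-initialize the freshly created linear layers
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight)
        m.bias.data.fill_(0.01)

model = NewModel()
pretrained_dict = torch.load('model.torch')  # state dict of the pretrained model
model_dict = model.state_dict()

# keep only the entries whose layer names also exist in the new model,
# and drop the head you want to train from scratch
pretrained_dict = {k: v for k, v in pretrained_dict.items()
                   if k in model_dict and not k.startswith('final_mlp')}
model_dict.update(pretrained_dict)
model.load_state_dict(model_dict)

# initialize the new final layers
model.final_mlp.apply(init_weights)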

Looks alright.
What is init_weights() doing?
Is it just initializing the last two linear layers?

I was looking at exactly that just now! I’ll fix init_weights(). Thanks!!

I don’t know why but my previous post disappeared. Therefore, I am posting the solution once again:

import torch
import torch.nn as nn

# method of your model (an nn.Module subclass)
def load_weights(self):
    pretrained_dict = torch.load('model.torch')
    model_dict = self.state_dict()

    # 1. filter out unnecessary keys
    pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in model_dict}

    # 2. overwrite entries in the existing state dict
    model_dict.update(pretrained_dict)

    # 3. load the new state dict
    self.load_state_dict(model_dict)

    # 4. create final layer and initialize it
    self.linear = nn.Linear(self.output_size, self.n_classes)
    torch.nn.init.xavier_uniform_(self.linear.weight)
    self.linear.bias.data.fill_(0.01)
    self.cuda()  # put model on GPU once again
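
You would then call the method on the model instance and build the optimizer afterwards, so that the newly created self.linear is registered with it (SGD here is just a placeholder optimizer):

model.load_weights()
# create the optimizer after load_weights(), so the new final layer's parameters are included
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)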

Hi, I use the above function to transfer trained weights from another model. However, once I call model.apply(load_weights) and try to train the model, my weights don’t update. Is there any additional step after this function that lets the optimizer update the weights?