I have a simple problem: I want to transfer the weights of one model to another, where only the final layers differ (an MLP head, for example).
I am loading the pretrained model with `load_state_dict`, but it tries to copy ALL the weights of the pretrained model. How do I copy only the weights I want, i.e. the ones that also exist in the new model (same layer name)?
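As a side note: when the only mismatches are extra or missing keys (not shape conflicts), `load_state_dict(strict=False)` can skip them directly. A minimal sketch with two hypothetical toy models (`Pretrained` and `NewModel` are illustration-only names, not from the original post):

```python
import torch
import torch.nn as nn

# Hypothetical models: identical backbone, different-size final layer.
class Pretrained(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(8, 16)
        self.head = nn.Linear(16, 10)  # 10-class head

class NewModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(8, 16)
        self.head = nn.Linear(16, 3)   # new 3-class head

old, new = Pretrained(), NewModel()
state = old.state_dict()
# Drop the keys whose shapes conflict with the new model.
state = {k: v for k, v in state.items() if not k.startswith('head')}
# strict=False ignores the missing head.* keys instead of raising an error.
missing, unexpected = new.load_state_dict(state, strict=False)
# The backbone weights were copied; only the head keys are reported missing.
```

This is the simplest route when you only want to skip the head; the filtered-dict recipe below gives finer control when key names overlap but shapes differ.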
I don’t know why, but my previous post disappeared, so I am posting the solution once again:
def load_weights(self):
    pretrained_dict = torch.load('model.torch')
    model_dict = self.state_dict()
    # 1. filter out keys that do not exist in this model
    pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in model_dict}
    # 2. overwrite entries in the existing state dict
    model_dict.update(pretrained_dict)
    # 3. load the new state dict
    self.load_state_dict(model_dict)
    # 4. create the final layer and initialize it
    self.linear = nn.Linear(self.output_size, self.n_classes)
    torch.nn.init.xavier_uniform_(self.linear.weight)
    self.linear.bias.data.fill_(0.01)
    self.cuda()  # put the model on the GPU once again
Hi, I use the above function to transfer trained weights from another model. However, once I call model.apply(load_weights) and try to train the model, my weights don’t update. Is there any additional step after this function that would let the optimizer update the weights?
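Two things likely cause this. First, `model.apply(fn)` calls `fn` on every submodule, not just once on the top-level model, so the method should be called directly as `model.load_weights()`. Second, `load_weights` replaces `self.linear` with a brand-new `nn.Linear`, so an optimizer built before the call still holds references to the old, now-orphaned parameters. A runnable sketch of the second issue (the `Net`/`replace_head` names are illustration-only, standing in for the layer swap inside `load_weights`):

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Linear(4, 4)
        self.linear = nn.Linear(4, 2)

    def replace_head(self):
        # Stands in for step 4 of load_weights: the old head is discarded.
        self.linear = nn.Linear(4, 2)

model = Net()
opt = torch.optim.SGD(model.parameters(), lr=0.1)  # built too early
model.replace_head()

# The optimizer still tracks the OLD head's parameters, so the new
# head never receives updates during training.
tracked = {id(p) for g in opt.param_groups for p in g['params']}
print(id(model.linear.weight) in tracked)  # False

# Fix: rebuild the optimizer AFTER the weights/layers are swapped.
opt = torch.optim.SGD(model.parameters(), lr=0.1)
tracked = {id(p) for g in opt.param_groups for p in g['params']}
print(id(model.linear.weight) in tracked)  # True
```

So the order should be: construct the model, call `model.load_weights()` once, and only then create the optimizer from `model.parameters()`.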