I have a simple problem: I want to transfer the weights of one model to another, where only the final layers differ (an MLP head, for example).
I am loading the pretrained model with `load_state_dict`, but it tries to copy ALL the weights of the pretrained model. How do I copy only the weights I want, i.e. the ones that also exist in the new model (same layer name)?
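As a side note: when the only mismatches are extra or missing keys (not shape conflicts), `load_state_dict(strict=False)` can skip them directly. A minimal sketch with two hypothetical toy models (`Pretrained` and `NewModel` are illustration-only names, not from the original post):

```python
import torch
import torch.nn as nn

# Hypothetical models: identical backbone, different-size final layer.
class Pretrained(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(8, 16)
        self.head = nn.Linear(16, 10)  # 10-class head

class NewModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(8, 16)
        self.head = nn.Linear(16, 3)   # new 3-class head

old, new = Pretrained(), NewModel()
state = old.state_dict()
# Drop the keys whose shapes conflict with the new model.
state = {k: v for k, v in state.items() if not k.startswith('head')}
# strict=False ignores the missing head.* keys instead of raising an error.
missing, unexpected = new.load_state_dict(state, strict=False)
# The backbone weights were copied; only the head keys are reported missing.
```

This is the simplest route when you only want to skip the head; the filtered-dict recipe below gives finer control when key names overlap but shapes differ.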
I don’t know why, but my previous post disappeared, so I am posting the solution once again:
def load_weights(self):
    pretrained_dict = torch.load('model.torch')
    model_dict = self.state_dict()
    # 1. filter out keys that do not exist in this model
    pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in model_dict}
    # 2. overwrite entries in the existing state dict
    model_dict.update(pretrained_dict)
    # 3. load the new state dict
    self.load_state_dict(model_dict)
    # 4. create the final layer and initialize it
    self.linear = nn.Linear(self.output_size, self.n_classes)
    torch.nn.init.xavier_uniform_(self.linear.weight)
    self.linear.bias.data.fill_(0.01)
    self.cuda()  # put the model on the GPU once again
Hi, I use the above function to transfer trained weights from another model. However, once I call model.apply(load_weights) and try to train the model, my weights don’t update. Is there any additional step after this function that would let the optimizer update the weights?
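Two things likely cause this. First, `model.apply(fn)` calls `fn` on every submodule, not just once on the top-level model, so the method should be called directly as `model.load_weights()`. Second, `load_weights` replaces `self.linear` with a brand-new `nn.Linear`, so an optimizer built before the call still holds references to the old, now-orphaned parameters. A runnable sketch of the second issue (the `Net`/`replace_head` names are illustration-only, standing in for the layer swap inside `load_weights`):

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Linear(4, 4)
        self.linear = nn.Linear(4, 2)

    def replace_head(self):
        # Stands in for step 4 of load_weights: the old head is discarded.
        self.linear = nn.Linear(4, 2)

model = Net()
opt = torch.optim.SGD(model.parameters(), lr=0.1)  # built too early
model.replace_head()

# The optimizer still tracks the OLD head's parameters, so the new
# head never receives updates during training.
tracked = {id(p) for g in opt.param_groups for p in g['params']}
print(id(model.linear.weight) in tracked)  # False

# Fix: rebuild the optimizer AFTER the weights/layers are swapped.
opt = torch.optim.SGD(model.parameters(), lr=0.1)
tracked = {id(p) for g in opt.param_groups for p in g['params']}
print(id(model.linear.weight) in tracked)  # True
```

So the order should be: construct the model, call `model.load_weights()` once, and only then create the optimizer from `model.parameters()`.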