Loading weights from pretrained model with different module names

I would like to initialize my model with the first 14 layers of VGG16. A nice solution suggested at https://discuss.pytorch.org/t/how-to-load-part-of-pre-trained-model/1113/3?u=nicholas_wickman does not work for me, because my parameters are not named the same as those in the model I would like to load from.

My parameters are named like conv1.weight and conv1.bias. VGG’s layers are named like features.0.weight and features.0.bias.

Does anyone have any suggestions?

pretrained_dict = torch.load('VGG_dict.pth')
model_dict = model.state_dict()

# Modification to the dictionary will go here?

model_dict.update(pretrained_dict)
model.load_state_dict(model_dict)

you could change the key names of the state dict to match your layer’s key names…
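For example, something along these lines (just a sketch, assuming you write the mapping from the VGG key names to your own by hand; the key_map name and the two entries shown are only illustrative):

pretrained_dict = torch.load('VGG_dict.pth')

# hand-written mapping from the pretrained key names to your own layer names
key_map = {
    'features.0.weight': 'conv1.weight',
    'features.0.bias': 'conv1.bias',
    # ... one entry per parameter you want to transfer
}

# build a renamed copy of the pretrained entries and merge it into your model's state dict
renamed_dict = {key_map[k]: v for k, v in pretrained_dict.items() if k in key_map}
model_dict = model.state_dict()
model_dict.update(renamed_dict)
model.load_state_dict(model_dict)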

@smth shouldn’t this work?

pre_trained_model = torch.load("Path to the .pth file")
new = list(pre_trained_model.items())

my_model_kvpair = mymodel.state_dict()
count = 0
for key, value in my_model_kvpair.items():
    # take the pretrained entries in order; this assumes both state dicts
    # list the same number of parameters in the same order
    layer_name, weights = new[count]
    my_model_kvpair[key] = weights
    count += 1

Thanks a lot. This is exactly what I am looking for.

In addition, do you know a good way to transfer the optimizer state to the new model with different names? The reason I am asking here is that the pre-trained model I have is constructed via nn.Sequential, but I want to insert some new blocks in between and retrain the new network. It will be easier for me if I can start with what they have.

It’s been a while since I have used PyTorch, so I have no idea how to do that offhand. But you can find it easily, as it was asked before, iirc.

Thanks for the prompt reply. I will search around again.

For me, on a nested nn.Module, this code did not work. Though the dictionary is updated as expected, the weights and biases actually didn’t change. Do you have other recommendations?

Pretty old thread, but I was having the same issue and have solved it. Here is what you need to do:

pre_trained = torch.load("Path to the .pth file")
new = list(pre_trained.items())

my_model_kvpair = my_model.state_dict()
count = 0
for key, value in my_model_kvpair.items():
    layer_name, weights = new[count]
    my_model_kvpair[key] = weights
    count += 1

# the crucial step: load the updated state dict back into the model
my_model.load_state_dict(my_model_kvpair)

After modifying the key-value pairs, you need to actually load the updated state_dict back into your model so that the weights and biases change.


I would modify it to:

    my_model_kvpair[key] = weights.detach().clone()

so that each copied weight is an independent tensor with no autograd history, rather than one that shares storage with the pretrained state dict.