Loading weights from pretrained model with different module names

I would like to initialize my model with the first 14 layers of VGG16. A nice solution suggested at https://discuss.pytorch.org/t/how-to-load-part-of-pre-trained-model/1113/3?u=nicholas_wickman does not work for me, because my parameters are not named the same as those in the model I would like to load from.

My parameters are named like conv1.weight and conv1.bias. VGG’s layers are named like features.0.weight and features.0.bias.

Does anyone have any suggestions?

pretrained_dict = torch.load('VGG_dict.pth')
model_dict = model.state_dict()

# Modification to the dictionary will go here?

model_dict.update(pretrained_dict)
model.load_state_dict(model_dict)

you could change the key names of the state dict to match your layer’s key names…
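For example, something along these lines (just a sketch, assuming you write the mapping from the VGG key names to your own by hand; the key_map name and the two entries shown are only illustrative):

pretrained_dict = torch.load('VGG_dict.pth')

# hand-written mapping from the pretrained key names to your own layer names
key_map = {
    'features.0.weight': 'conv1.weight',
    'features.0.bias': 'conv1.bias',
    # ... one entry per parameter you want to transfer
}

# build a renamed copy of the pretrained entries and merge it into your model's state dict
renamed_dict = {key_map[k]: v for k, v in pretrained_dict.items() if k in key_map}
model_dict = model.state_dict()
model_dict.update(renamed_dict)
model.load_state_dict(model_dict)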

@smth shouldn’t this work?

pre_trained_model = torch.load("Path to the .pth file")
new = list(pre_trained_model.items())

my_model_kvpair = mymodel.state_dict()
count = 0
for key, value in my_model_kvpair.items():
    # take the pretrained entries in order; this assumes both state dicts
    # list the same number of parameters in the same order
    layer_name, weights = new[count]
    my_model_kvpair[key] = weights
    count += 1

Thanks a lot. This is exactly what I am looking for.

In addition, do you know a good way to transfer the optimizer state to the new model with different names? The reason I am asking here is that the pre-trained model I have is constructed via nn.Sequential, but I want to insert some new blocks in between and retrain the new network. It will be easier for me if I can start with what they have.

It’s been a while since I have used PyTorch, so I have no idea how to do that offhand. But you can find it easily, as it was asked before, iirc.

Thanks for the prompt reply. I will search around again.

For me, on a nested nn.Module, this code did not work. Though the dictionary is updated as expected, the weights and biases actually didn’t change. Do you have other recommendations?

Pretty old thread, but I was having the same issue and have solved it. Here is what you need to do:

pre_trained = torch.load("Path to the .pth file")
new = list(pre_trained.items())

my_model_kvpair = my_model.state_dict()
count = 0
for key, value in my_model_kvpair.items():
    layer_name, weights = new[count]
    my_model_kvpair[key] = weights
    count += 1

# the crucial step: load the updated state dict back into the model
my_model.load_state_dict(my_model_kvpair)

After modifying the key-value pairs, you need to actually load the updated state_dict back into your model so that the weights and biases change.


I would modify it to:

    my_model_kvpair[key] = weights.detach().clone()

so that each copied weight is an independent tensor with no autograd history, rather than one that shares storage with the pretrained state dict.