Copy weights only from a network's parameters

Given two sets of parameters, I want to copy only the weights from one set of parameters to the other. Is it possible to do this?

I think you can do something like this:

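# Copy every parameter of model1 into the same-named parameter of model2.
params1 = model1.named_parameters()
params2 = model2.named_parameters()

# named_parameters() returns a generator, so build a dict for lookup by name.
dict_params2 = dict(params2)

for name1, param1 in params1:
    if name1 in dict_params2:
        # In-place copy through .data, so autograd does not track it.
        dict_params2[name1].data.copy_(param1.data)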
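
A self-contained version of the snippet above, as a minimal sketch. `copy_matching_params` and the two toy `nn.Sequential` models are hypothetical names made up for illustration, and the shape check is an extra assumption that is not in the original post. It uses `torch.no_grad()` with `copy_` rather than going through `.data`:

```python
import torch
import torch.nn as nn


def copy_matching_params(src: nn.Module, dst: nn.Module) -> list:
    """Copy values of same-named, same-shaped parameters from src into dst.

    Hypothetical helper for illustration; returns the names that were copied.
    """
    dst_params = dict(dst.named_parameters())
    copied = []
    with torch.no_grad():  # the in-place copy should not be tracked by autograd
        for name, param in src.named_parameters():
            target = dst_params.get(name)
            if target is not None and target.shape == param.shape:
                target.copy_(param)
                copied.append(name)
    return copied


# Toy models: the first layer matches, dst_model has one extra layer.
src_model = nn.Sequential(nn.Linear(4, 3))
dst_model = nn.Sequential(nn.Linear(4, 3), nn.Linear(3, 2))
copied_names = copy_matching_params(src_model, dst_model)
print(copied_names)
```

Note that `named_parameters()` only covers parameters, so buffers such as BatchNorm running statistics are not copied this way.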

@fmassa
Hello, can this be achieved with load_state_dict? If net1 is built on top of net2 with some extra layers added, and I want to fine-tune net1 starting from net2's weights, can I use load_state_dict directly? Thanks!
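
If the layers shared by the two networks have the same names and shapes, one option is `load_state_dict` with `strict=False`, which tolerates keys that are missing from the source. A minimal sketch; the `Net2` and `Net1` classes below are hypothetical stand-ins for the networks in the question:

```python
import torch
import torch.nn as nn


class Net2(nn.Module):
    """Hypothetical base network."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(8, 4)

    def forward(self, x):
        return self.fc1(x)


class Net1(nn.Module):
    """Hypothetical network: same fc1 as Net2 plus an extra layer."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(8, 4)
        self.fc2 = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))


net2 = Net2()
net1 = Net1()

# strict=False: keys absent from net2's state dict are reported, not raised.
result = net1.load_state_dict(net2.state_dict(), strict=False)
print(result.missing_keys)     # layers in net1 that net2 did not provide
print(result.unexpected_keys)  # keys from net2 that net1 does not have
```

`strict=False` only relaxes the key matching. A key that exists in both networks but with different shapes still raises an error.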

@chenchr I think it works. One can try model.fc4.load_state_dict(model.fc3.state_dict()) to update the fc4 layer’s parameters from the fc3 layer.

@fmassa, however, I'm not sure whether clone(), state_dict(), or deepcopy would be a better choice. It would be great if you could elaborate on the differences!
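
As a rough illustration of the difference (a sketch on a toy `nn.Linear`, not an authoritative answer): the tensors returned by `state_dict()` share memory with the live parameters, while `clone()` and `copy.deepcopy` allocate new memory. The variable names are made up for this example:

```python
import copy

import torch
import torch.nn as nn

layer = nn.Linear(3, 2)

# state_dict() values are detached tensors that share memory with the parameters.
sd = layer.state_dict()
shares_memory = sd["weight"].data_ptr() == layer.weight.data_ptr()

# clone() allocates new memory for a single tensor.
weight_clone = layer.weight.detach().clone()

# deepcopy duplicates the whole module, parameters and buffers included.
layer_copy = copy.deepcopy(layer)

# Change the original in place and check which objects follow it.
with torch.no_grad():
    layer.weight.add_(1.0)

state_dict_follows = torch.equal(sd["weight"], layer.weight)
clone_follows = torch.equal(weight_clone, layer.weight)
deepcopy_follows = torch.equal(layer_copy.weight, layer.weight)
print(shares_memory, state_dict_follows, clone_follows, deepcopy_follows)
```

So a state dict kept around as a snapshot should be cloned or deep-copied first. `load_state_dict` itself copies the values into the target module's own parameters, so the two modules do not end up sharing memory.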

Is there a better way to copy layer parameters from one model to another in 2020 (when trying to transfer a trained encoder or something else)?

I created this helper function per the discussion above but it doesn’t seem to be working as expected!

def copyParams(module_src, module_dest):
    # Copy each parameter of module_src into the same-named parameter of module_dest.
    params_src = module_src.named_parameters()
    params_dest = module_dest.named_parameters()

    dict_dest = dict(params_dest)

    for name, param in params_src:
        if name in dict_dest:
            dict_dest[name].data.copy_(param.data)

Any news on this? I am also moving forward implementing this function. Basically, I want to do some operations that will accumulate the gradient information in the form of a delta_params. Then I want to apply it to the original params, and replace the params in the model with original_params + delta_params.
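
One way the original_params + delta_params update could look, as a minimal sketch. It assumes delta_params is a dict keyed by parameter name, and `apply_delta` is a hypothetical helper, not an existing PyTorch function:

```python
import torch
import torch.nn as nn


def apply_delta(model: nn.Module, delta_params: dict) -> None:
    """Add delta_params[name] to each same-named parameter, in place."""
    with torch.no_grad():  # update the values without recording the op in autograd
        for name, param in model.named_parameters():
            if name in delta_params:
                param.add_(delta_params[name])


model = nn.Linear(2, 2)
# Snapshot the original values so the update can be checked afterwards.
before = {n: p.detach().clone() for n, p in model.named_parameters()}
# Toy delta: 0.5 added to every entry of every parameter.
delta = {n: torch.full_like(p, 0.5) for n, p in model.named_parameters()}
apply_delta(model, delta)
```

Because the update is in place, the parameter objects stay the same, so an optimizer that already holds references to them keeps working.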

I believe that code actually does copy parameters as expected. I had another issue which is why I was getting strange results (I didn’t copy the original model’s decoder).