Given two sets of parameters, I want to copy only the weights from one set of parameters to the other. Is it possible to do this?
I think you can do something like
params1 = model1.named_parameters()
params2 = model2.named_parameters()

dict_params2 = dict(params2)

for name1, param1 in params1:
    if name1 in dict_params2:
        dict_params2[name1].data.copy_(param1.data)
Hello, can this be achieved with load_state_dict? If net1() is built on top of net2() with some extra layers added, and I want to fine-tune net1() using net2()'s weights, can I directly use load_state_dict? Thanks!
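For what it's worth, a minimal sketch of that fine-tuning setup, using hypothetical Net2 and Net1 classes (Net1 adds one extra layer). Passing strict=False to load_state_dict makes it ignore keys that don't match, so only the shared weights are loaded:

```python
import torch
import torch.nn as nn

class Net2(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 20)
        self.fc2 = nn.Linear(20, 5)

class Net1(Net2):
    def __init__(self):
        super().__init__()
        self.fc3 = nn.Linear(5, 2)  # extra layer not present in Net2

net1, net2 = Net1(), Net2()

# strict=False skips mismatched keys (fc3 here), so only the
# shared fc1/fc2 weights are loaded from net2.
result = net1.load_state_dict(net2.state_dict(), strict=False)
```

The returned object lists missing_keys (parameters in net1 that net2 didn't provide, i.e. fc3) and unexpected_keys (should be empty here), which is a handy sanity check.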
@chenchr I think it works. One can try

model.fc4.load_state_dict(model.fc3.state_dict())

to update the fc4 layer's parameters using the fc3 layer's weights.
@fmassa However, I'm not sure whether state_dict() or deepcopy would be the better choice here. It would be great if you could elaborate on the differences!
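In case it helps, a sketch of the two approaches side by side, using a hypothetical single-layer model. deepcopy clones the entire module object (parameters, buffers, and structure), while state_dict() only copies tensor values into an already-constructed module of the same architecture:

```python
import copy
import torch
import torch.nn as nn

model = nn.Linear(4, 3)

# Approach 1: deepcopy creates a brand-new module object,
# cloning parameters and buffers along with it.
clone = copy.deepcopy(model)

# Approach 2: state_dict()/load_state_dict copies only the tensor
# values into an existing module you have built yourself.
other = nn.Linear(4, 3)
other.load_state_dict(model.state_dict())
```

So deepcopy is the simpler option when you want an independent duplicate of the whole model, while state_dict() is the usual route when the target module already exists (e.g. for checkpointing or partial transfer).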
Is there a better way to copy layer parameters from one model to another in 2020 (when trying to transfer a trained encoder or something else)?
I created this helper function per the discussion above but it doesn’t seem to be working as expected!
def copyParams(module_src, module_dest):
    params_src = module_src.named_parameters()
    params_dest = module_dest.named_parameters()

    dict_dest = dict(params_dest)

    for name, param in params_src:
        if name in dict_dest:
            dict_dest[name].data.copy_(param.data)
Any news on this? I am also moving forward implementing this function. Basically, I want to do some operations that will accumulate the gradient information in the form of a delta_params. Then I want to apply it to the original params, and replace the params in the model with original_params + delta_params.
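A rough sketch of that idea, with hypothetical names (original, delta): snapshot the original parameters, build a per-parameter delta (here a dummy constant step standing in for accumulated gradient information), then write original + delta back into the model in-place:

```python
import torch
import torch.nn as nn

model = nn.Linear(3, 2)

# Snapshot the original parameters by name.
original = {n: p.detach().clone() for n, p in model.named_parameters()}

# Accumulate a delta per parameter. A constant step is used here as a
# placeholder; in practice this would come from gradient information.
delta = {n: torch.full_like(p, 0.1) for n, p in model.named_parameters()}

# Replace each parameter with original + delta, outside of autograd
# so the in-place update is not tracked.
with torch.no_grad():
    for name, param in model.named_parameters():
        param.copy_(original[name] + delta[name])
```

Doing the update under torch.no_grad() (or via param.data) matters; copying into a leaf parameter that requires grad would otherwise raise an error.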
I believe that code actually does copy parameters as expected. I had another issue which is why I was getting strange results (I didn’t copy the original model’s decoder).