Copying part of the weights

I want to copy part of the weights from one network to another, using something like Polyak averaging.

Example:

weights_new = k*weights_old + (1-k)*weights_new

This is required to implement DDPG.

How can I do this?

Something like this should do:

# per layer and per weight parameter ('layer' stands in for an actual named submodule)
other_model.layer.weight.data = k * model.layer.weight.data + (1 - k) * other_model.layer.weight.data

Your solution is missing a for loop, no? How do you actually do this with a for loop?

Error message:

>>> net
Sequential(
  (0): Linear(in_features=2, out_features=2)
  (1): Linear(in_features=2, out_features=2)
)
>>> net.layer
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/brandomiranda/miniconda3/envs/pytorch_overparam/lib/python3.6/site-packages/torch/nn/modules/module.py", line 366, in __getattr__
    type(self).__name__, name))
AttributeError: 'Sequential' object has no attribute 'layer'
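
For reference, a Sequential has no .layer attribute; its submodules are accessed by index (net[0], net[1]). A minimal sketch of the blend with a loop, assuming both nets share the same architecture:

import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(2, 2), nn.Linear(2, 2))
other_net = nn.Sequential(nn.Linear(2, 2), nn.Linear(2, 2))

k = 0.5  # interpolation coefficient
with torch.no_grad():
    # parameters() yields the weights and biases of each indexed submodule in order
    for p_old, p_new in zip(net.parameters(), other_net.parameters()):
        p_new.copy_(k * p_old + (1 - k) * p_new)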

Real solution:

beta = 0.5  # the interpolation parameter
params1 = model1.named_parameters()
params2 = model2.named_parameters()

dict_params2 = dict(params2)

for name1, param1 in params1:
    if name1 in dict_params2:
        # copy_ writes the blended tensor into model2's parameter in place
        dict_params2[name1].data.copy_(beta * param1.data + (1 - beta) * dict_params2[name1].data)

model1.load_state_dict(dict_params2)
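
As a usage note, for DDPG-style target-network updates the same blend is commonly written directly over parameter pairs. A sketch, assuming actor and actor_target are networks of identical architecture and tau is the soft-update rate:

import torch

tau = 0.005  # Polyak / soft-update coefficient
with torch.no_grad():
    # target <- tau * source + (1 - tau) * target
    for p_target, p_source in zip(actor_target.parameters(), actor.parameters()):
        p_target.copy_(tau * p_source + (1 - tau) * p_target)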

named_parameters() doesn't work well with my code: I got a "missing keys" error from load_state_dict(), since named_parameters() does not include buffers (e.g. BatchNorm running statistics).
state_dict() is the solution.

beta = 0.5  # the interpolation parameter
params1 = model1.state_dict()
params2 = model2.state_dict()

dict_params2 = dict(params2)

# state_dict() returns a dict, so iterate over items() to get (name, tensor) pairs
for name1, param1 in params1.items():
    if name1 in dict_params2:
        dict_params2[name1].data.copy_(beta * param1.data + (1 - beta) * dict_params2[name1].data)

model1.load_state_dict(dict_params2)


I tried to apply this to load pretrained weights from resnet18, and it failed on loading batchnorm.running_mean.
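
A likely cause: batchnorm running statistics are buffers, not parameters, so named_parameters() skips them, and the integer buffer num_batches_tracked cannot be meaningfully interpolated even via state_dict(). A sketch that blends only floating-point entries and copies the rest through unchanged, assuming model1 and model2 are both resnet18 instances:

import torch
import torchvision.models as models

model1 = models.resnet18(pretrained=True)
model2 = models.resnet18(pretrained=True)

beta = 0.5
params1 = model1.state_dict()
params2 = model2.state_dict()

blended = {}
for name, t2 in params2.items():
    t1 = params1[name]
    if t1.is_floating_point():
        # blend weights and float buffers such as running_mean / running_var
        blended[name] = beta * t1 + (1 - beta) * t2
    else:
        # integer buffers such as num_batches_tracked are copied as-is
        blended[name] = t2.clone()

model1.load_state_dict(blended)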