Hi, so I have this network:
class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.conv2d = nn.Sequential(
            nn.Conv2d(1, 64, (3, 6), (1, 1)),
            nn.Hardsigmoid()
        )

    def forward(self, input):
        output = self.conv2d(input)
        return output
The network is trained on two different datasets to obtain two different models:
model1
model2
My objective is to average their weights and biases and put the result into a third model:
model3
How can I achieve that?
Cool question, I've tried it.
I think here's how you can solve this:
We can get the parameters of any model with model.parameters(),
which can be appended into a list as below:
params1 = []
for param in model1.parameters():
    params1.append(param.data)
Similarly, do this for the trained model2 and save the result in a list params2.
Now initialize the weights of model3 as:
model3 = Network()
params3 = iter(params1 + params2)
for param in model3.parameters():
    param.data = next(params3)
But in this way there is no way to average the weights from model1 and model2, right?
Oh, you asked to average the weights and put them on model3; I misunderstood.
I think your question is really about ensembling two or more pretrained models; here is how you can do that:
https://discuss.pytorch.org/t/combining-trained-models-in-pytorch/28383
Hope it solves your problem.
No, in the post you sent me he combines two models. My goal is to take two identical models, get their weights and biases, and average them into a third model that has the same structure as the previous two.
Model parameters are actually the weights and biases. As earlier, you can also get the names of those parameters with:
for name, params in model.named_parameters():
    print(name)
which, for the Network above, gives the weight and bias names:
conv2d.0.weight
conv2d.0.bias
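Putting the pieces together, here is a minimal sketch of the averaging the original question asked for. It uses state_dict() instead of iterating parameters() so each tensor is matched by name; the two freshly constructed models below stand in for the actual trained model1 and model2, which would be loaded from checkpoints in practice.

```python
import torch
import torch.nn as nn

class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.conv2d = nn.Sequential(
            nn.Conv2d(1, 64, (3, 6), (1, 1)),
            nn.Hardsigmoid()
        )

    def forward(self, input):
        return self.conv2d(input)

# Stand-ins for the two trained models (load your checkpoints here instead).
model1 = Network()
model2 = Network()
model3 = Network()

# Average every parameter and buffer tensor by name, then load the
# averaged state into model3. Both models must share the same architecture.
sd1 = model1.state_dict()
sd2 = model2.state_dict()
sd3 = {key: (sd1[key] + sd2[key]) / 2 for key in sd1}
model3.load_state_dict(sd3)
```

Since the keys of the two state dicts line up exactly for identical architectures, every weight and bias (e.g. conv2d.0.weight, conv2d.0.bias) in model3 ends up as the element-wise mean of the corresponding tensors in model1 and model2.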