Problem: merging the parameters of two models into another model with the same network structure

The code is the following:

if epoch >= self.epoch_merge:  # was `>`, which made the inner `==` check unreachable
    if epoch == self.epoch_merge:
        # One-time merge: copy the element-wise sum of the two decoders'
        # parameters into decoder_merge.
        for p1, p2, p_merge in zip(self.decoder1.parameters(),
                                   self.decoder2.parameters(),
                                   self.decoder_merge.parameters()):
            p_merge.data = p1.data + p2.data
    out_merge = self.decoder_merge(feature_merge_list)

My goal is to merge all the parameters of two decoders into a new decoder (named decoder_merge) with the same network structure, and then continue training the new decoder. However, at the epoch where epoch equals epoch_merge, the accuracy drops sharply compared to the previous epoch. After that, the accuracy increases normally again.
So I would like to know: what causes this cliff-like drop? Is my merging approach correct? Thanks in advance.

Maybe the optimizer uses some momentum values that were optimized for the original parameters?
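If stale Adam state is the issue, one option is to create a fresh optimizer right after the merge, so its running first/second-moment estimates start from zero instead of being tuned to the old parameters. A minimal sketch (the `nn.Linear` decoders, the averaging merge, and the learning rate are illustrative assumptions, not the original code):

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the decoders in the question.
decoder1 = nn.Linear(4, 4)
decoder2 = nn.Linear(4, 4)
decoder_merge = nn.Linear(4, 4)

# One-time merge: average the two trained decoders into decoder_merge.
with torch.no_grad():
    for p1, p2, pm in zip(decoder1.parameters(),
                          decoder2.parameters(),
                          decoder_merge.parameters()):
        pm.copy_((p1 + p2) / 2)

# Re-create the optimizer after the merge so Adam's moment estimates
# are not the stale ones accumulated for the pre-merge parameters.
optimizer = torch.optim.Adam(decoder_merge.parameters(), lr=1e-3)
```

A freshly constructed optimizer has an empty `state` dict, so no old momentum can be applied on the first post-merge step.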

The optimizer in my code is Adam.

Do you know of other methods for loading model parameters? Thanks a lot.
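One common alternative to assigning `.data` in a loop is to build a merged state dict and load it with `load_state_dict`, which also validates that the keys and shapes match. A sketch assuming two decoders with identical architectures (the `nn.Linear` modules here are placeholders):

```python
import torch
import torch.nn as nn

# Hypothetical decoders with the same network structure.
decoder1 = nn.Linear(3, 2)
decoder2 = nn.Linear(3, 2)
decoder_merge = nn.Linear(3, 2)

# Average the two state dicts key by key, then load the result.
sd1, sd2 = decoder1.state_dict(), decoder2.state_dict()
merged_state = {key: (sd1[key] + sd2[key]) / 2 for key in sd1}
decoder_merge.load_state_dict(merged_state)
```

`load_state_dict` raises an error on mismatched keys or tensor shapes, which catches silent mistakes that a raw `zip` over `parameters()` would not.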

Was the accuracy of the two separate decoders high, while the new "merged" decoder yields a low accuracy?
If so, I think this might be expected if the parameters are not "close" to each other (whatever that means in a high-dimensional space). There are methods such as Stochastic Weight Averaging (SWA) that might help, but I don't think that computing the "mean" of two trained models generally yields a better accuracy.
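For reference, PyTorch ships an SWA helper, `torch.optim.swa_utils.AveragedModel`, which keeps a running average of weights collected along a single training trajectory; note this is different from averaging two independently trained models. A minimal sketch (the `nn.Linear` model and the loop are illustrative):

```python
import torch
import torch.nn as nn
from torch.optim.swa_utils import AveragedModel

# Hypothetical model being trained.
model = nn.Linear(3, 2)
swa_model = AveragedModel(model)

# During training, periodically fold the current weights into the
# running average (here the model is not actually updated, so the
# average stays equal to the model's weights).
for _ in range(5):
    # ... training steps would update `model` here ...
    swa_model.update_parameters(model)
```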


Thank you, you are right about that. I just wanted to confirm that the merging code itself is correct.