if epoch >= self.epoch_merge:
    if epoch == self.epoch_merge:
        # Initialize decoder_merge once: average the corresponding
        # parameters of the two trained decoders
        with torch.no_grad():
            for p1, p2, p_m in zip(self.decoder1.parameters(),
                                   self.decoder2.parameters(),
                                   self.decoder_merge.parameters()):
                p_m.copy_((p1 + p2) / 2)
    out_merge = self.decoder_merge(feature_merge_list)
My goal is to merge all parameters of two decoders into a new decoder (named decoder_merge) with the same network structure, and then continue training the merged decoder. However, at the epoch where epoch == epoch_merge, the accuracy drops sharply compared to the previous epoch; after that, the accuracy increases normally again.
So I would like to know: what causes this cliff-like drop? Is my merging approach correct? Thanks in advance.
Was the accuracy of the two separate decoders high, while the new “merged” decoder yielded a low accuracy?
If so, I think this might be expected if the parameters are not “close” to each other (whatever that means in the high-dimensional space). There are methods such as Stochastic Weight Averaging (SWA) which might help, but I don’t think that simply taking the “mean” of two trained models generally yields a better model.
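To illustrate why averaging parameters can fail even when both models are accurate, here is a small toy example (not your code, just a sketch): two one-hidden-layer networks that compute exactly the same function, but with their hidden units permuted. Each network fits the target perfectly, yet their parameter-wise average collapses to a constant output.

```python
import numpy as np

def forward(W, v, x):
    """One-hidden-layer net: y = tanh(x * W) @ v for a batch of scalar inputs x."""
    return np.tanh(np.outer(x, W)) @ v

x = np.linspace(-2, 2, 50)
target = 2 * np.tanh(x)

# Model A and Model B compute the identical function,
# but with their two hidden units swapped (a permutation symmetry).
W_a, v_a = np.array([1.0, -1.0]), np.array([1.0, -1.0])
W_b, v_b = np.array([-1.0, 1.0]), np.array([-1.0, 1.0])

# Parameter-wise average (the "merged" model): all weights cancel to zero.
W_m, v_m = (W_a + W_b) / 2, (v_a + v_b) / 2

def mse(W, v):
    return np.mean((forward(W, v, x) - target) ** 2)

print(mse(W_a, v_a))  # ~0 (fits the target perfectly)
print(mse(W_b, v_b))  # ~0 (same function, permuted units)
print(mse(W_m, v_m))  # large (merged net outputs 0 everywhere)
```

So even if decoder1 and decoder2 perform equally well, their average can land in a high-loss region; that would explain the accuracy cliff at epoch_merge, which then recovers as training continues from the merged point.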