Combining model weights

Hi All,

I have a question on combining model weights accurately.
I made one NN and trained the model separately on two datasets. Now I am trying to obtain a single model out of these two models by combining the weights.

In your opinion, what are the different ways to combine the weights?

Thanks in advance for answering this question

Hi Jees,

Let me try to make sure I understand your use case completely.
You have one model architecture and you initialized two models using it.
These two models were trained on two different datasets and converged.

Now you would like to “combine” all parameters of both models and create a single one.
Would you now want to use this new single model to predict on both datasets, or just one of them?
Do both datasets contain the same classes or are the targets completely different?

Thanks a lot @ptrblck for the queries.

You have one model architecture and you initialized two models using it.

Yes, that is correct.

These two models were trained on two different datasets and converged.

That is also correct.

Now you would like to “combine” all parameters of both models and create a single one.

That is the idea, yes.

Would you now want to use this new single model to predict on both datasets, or just one of them?

I want it to predict both of them with reasonable accuracy.

Do both datasets contain the same classes or are the targets completely different?

Completely different (we are not sure yet, but let us consider that case).

Thanks for the queries. I hope this helps you give me some pointers.

Hi,
Can we combine two weight files into a single one without catastrophic forgetting on data A and data B?
In our case we first trained on some initial data, and after some time we received new data. So we trained only on the new data and now want to merge both weight files into one.
The number of target classes is the same and both datasets are from the same domain.
I combined both models, but I observed catastrophic forgetting.
Please suggest the best way to do this.

I don’t think combining different trained parameters will automatically result in a good new model, as both models might (and most likely will) have converged to different local minima.
The mean (I assume you are taking the average of all parameters) will not necessarily yield another minimum on the loss surface.
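For reference, a minimal sketch of this kind of naive parameter averaging (assuming both models share exactly the same architecture, i.e. identical state_dict keys and shapes; `model_a`, `model_b`, and `MergedNet` are placeholder names, not anything from this thread):

```python
import copy
import torch

def average_state_dicts(model_a, model_b):
    # Naive parameter averaging: only applicable when both models were
    # created from the same architecture (identical state_dict keys/shapes).
    sd_a = model_a.state_dict()
    sd_b = model_b.state_dict()
    averaged = copy.deepcopy(sd_a)
    for key in averaged:
        if torch.is_floating_point(averaged[key]):
            averaged[key] = (sd_a[key] + sd_b[key]) / 2.0
        # Non-float buffers (e.g. BatchNorm's num_batches_tracked)
        # are simply kept from model_a.
    return averaged

# merged = MergedNet()  # placeholder model class
# merged.load_state_dict(average_state_dicts(model_a, model_b))
```

As described above, there is no guarantee that the averaged parameters lie anywhere near a minimum of the loss for either dataset, so the merged model usually needs to be validated (and most likely retrained).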

You could try to fine-tune your model by retraining it on the new samples with a low learning rate, and then check the validation accuracy on the updated dataset again.
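As a rough sketch of that fine-tuning step (assuming a standard PyTorch classification setup; `model`, `updated_loader`, and the hyperparameters are all placeholders, not anything from this thread):

```python
import torch
import torch.nn as nn

def finetune(model, updated_loader, epochs=3, lr=1e-4, device="cuda"):
    # Retrain with a low learning rate so the previously learned
    # parameters are only adjusted gently by the new samples.
    model.to(device)
    model.train()
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    for _ in range(epochs):
        for inputs, targets in updated_loader:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad()
            loss = criterion(model(inputs), targets)
            loss.backward()
            optimizer.step()
    return model
```

After each epoch you would re-check the validation accuracy on the updated dataset (old and new samples) to see whether the model still performs reasonably on the original data.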