Layer-wise linear combination of two models

Hannes_F · February 17, 2023, 3:35pm

I have two trained neural network models, M1 and M2, in PyTorch.

Both have exactly the same neural network architecture (same number of layers, which we will denote by ‘n’). Furthermore, I have an n-dimensional vector ‘lambda’, where each value ‘lambda_i’ is in the range [0, 1].

What I want to calculate now is the ‘layer-wise’ linear combination ‘M = lambda * M1 + (1 - lambda) * M2’. That is, the i-th layer in M is calculated as the linear combination of the i-th layer in M1 (using factor ‘lambda_i’) and the i-th layer in M2 (using factor ‘1 - lambda_i’). The linear combination should be applied to both the ‘weight’ and the ‘bias’ of the i-th layer.

How can I calculate this in PyTorch?

ptrblck · February 17, 2023, 7:44pm

You could iterate both state_dicts of the models, create new parameters by applying your weighting, and create a new state_dict. Afterwards, you could load it into a model object.
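
The steps above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it uses a single scalar factor `lam` for the whole model to keep the mechanics visible, and the helper name `blend_state_dicts` as well as the small `nn.Sequential` models are assumptions made for illustration. Non-floating-point entries (such as the `num_batches_tracked` buffer of BatchNorm layers) are copied from the first model rather than blended.

```python
import copy

import torch
import torch.nn as nn


def blend_state_dicts(sd1, sd2, lam):
    """Return lam * sd1 + (1 - lam) * sd2, entry by entry (hypothetical helper)."""
    blended = {}
    for key, t1 in sd1.items():
        t2 = sd2[key]  # same architecture, so the same key exists in both dicts
        if torch.is_floating_point(t1):
            blended[key] = lam * t1 + (1.0 - lam) * t2
        else:
            # Integer buffers (e.g. BatchNorm's num_batches_tracked) are copied as-is.
            blended[key] = t1.clone()
    return blended


# Two stand-in models with identical architecture (assumed for illustration).
torch.manual_seed(0)
m1 = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
m2 = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

# Load the blended state_dict into a fresh model object of the same architecture.
m = copy.deepcopy(m1)
m.load_state_dict(blend_state_dicts(m1.state_dict(), m2.state_dict(), lam=0.3))
```

Copying one of the models with `copy.deepcopy` is just one convenient way to obtain a model object of the right architecture; constructing it from the model class works equally well.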

Hannes_F · February 18, 2023, 10:39am

Thank you!

How can I determine which keys of the ‘state_dict’ dictionary belong to the same layer?

I suppose that, for each key, the layer name is the part of the key string before the last occurrence of the dot character “.”.
Is that right?

As in the example at “Saving and Loading Models” (PyTorch Tutorials 1.0.0.dev20181128 documentation)
and in “How do I update old key name in state_dict to new version”.

Btw, greetings to California. I hope GTC 2024 will be held in person again; I am looking forward to presenting again at this great conference.

ptrblck · February 18, 2023, 6:17pm

Each key of the state_dict will be unique and represents a registered parameter or buffer. Since both models have the exact same architecture you should be able to iterate one state_dict and use its key to index both dicts inside the loop.
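
For the layer-wise case, one possible sketch is shown below. It assumes that the layer name is the part of each key before the last dot (so `0.weight` and `0.bias` both belong to layer `0`), and that the i-th entry of `lambdas` belongs to the i-th such layer in `state_dict` order. Layers without parameters or buffers (e.g. `nn.ReLU`) do not appear in the `state_dict`, so under this assumption ‘n’ counts only layers that have entries. The names `layer_of` and `layerwise_blend` are hypothetical helpers, not part of PyTorch.

```python
import copy

import torch
import torch.nn as nn


def layer_of(key):
    """Part of the key before the last dot: 'block.conv.bias' -> 'block.conv'."""
    return key.rsplit(".", 1)[0]


def layerwise_blend(model1, model2, lambdas):
    """Build a model whose i-th layer is lambdas[i] * model1 + (1 - lambdas[i]) * model2."""
    sd1, sd2 = model1.state_dict(), model2.state_dict()

    # Unique layer names in order of first appearance in the state_dict.
    layers = list(dict.fromkeys(layer_of(k) for k in sd1))
    if len(layers) != len(lambdas):
        raise ValueError(f"expected {len(layers)} lambdas, got {len(lambdas)}")
    lam_of = dict(zip(layers, lambdas))

    blended = {}
    for key, t1 in sd1.items():
        lam = float(lam_of[layer_of(key)])
        if torch.is_floating_point(t1):
            # Iterate one state_dict and use its key to index both dicts.
            blended[key] = lam * t1 + (1.0 - lam) * sd2[key]
        else:
            # Non-float buffers (e.g. num_batches_tracked) are copied from model1.
            blended[key] = t1.clone()

    merged = copy.deepcopy(model1)
    merged.load_state_dict(blended)
    return merged


# Stand-in models (assumed for illustration); keys are 0.weight, 0.bias, 2.weight, 2.bias.
torch.manual_seed(0)
m1 = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
m2 = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
m = layerwise_blend(m1, m2, lambdas=[0.25, 0.9])
```

If the grouping by key prefix does not match what you consider a "layer" (for example, nested blocks that should share one factor), the mapping from key to lambda can be replaced by an explicit dictionary.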

Greetings back!