I have two trained neural network models M1 and M2 in PyTorch.
Both have exactly the same neural network architecture (same number of layers, which we will denote by ‘n’). Furthermore, I have an n-dimensional vector ‘lambda’, where each value ‘lambda_i’ is in the range [0, 1].
What I want to calculate now is the ‘layer-wise’ linear combination ‘M = lambda * M1 + (1 - lambda) * M2’. This means that the i-th layer in M is calculated as the linear combination of the i-th layer in M1 (using factor ‘lambda_i’) and the i-th layer in M2 (using factor ‘1 - lambda_i’). The linear combination should be applied to both the ‘weight’ and the ‘bias’ of the i-th layer.
You could iterate both state_dicts of the models, create new parameters by applying your weighting, and create a new state_dict. Afterwards, you could load it into a model object.
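A minimal sketch of that approach, using a single scalar factor to keep it short. The function name `interpolate_models` and the toy `nn.Sequential` models are made up for illustration. It also assumes that non-floating-point entries (for example an integer counter buffer) should simply be copied from the first model rather than blended.

```python
import copy

import torch
import torch.nn as nn


def interpolate_models(m1, m2, lam):
    """Build a model whose entries are lam * m1 + (1 - lam) * m2 (hypothetical helper)."""
    sd1, sd2 = m1.state_dict(), m2.state_dict()
    new_sd = {}
    for key, t1 in sd1.items():
        t2 = sd2[key]  # same architecture, so the same key exists in both dicts
        if torch.is_floating_point(t1):
            new_sd[key] = lam * t1 + (1.0 - lam) * t2
        else:
            # Assumption: integer buffers are not interpolated, just copied from m1.
            new_sd[key] = t1.clone()
    merged = copy.deepcopy(m1)  # fresh model object with the same architecture
    merged.load_state_dict(new_sd)
    return merged


# Toy models standing in for M1 and M2.
m1 = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))
m2 = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))
merged = interpolate_models(m1, m2, 0.25)
```

Using `copy.deepcopy(m1)` avoids having to know the constructor arguments of the model; if you have the model class at hand, instantiating it directly works as well.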
How can I determine which keys of the ‘state_dict’ dictionary belong to the same layer?
I suppose that for each key, its layer name is the first part of the key name, i.e. everything before the last occurrence of the dot character “.” in the key string.
Is that right?
Each key of the state_dict will be unique and represents a registered parameter or buffer. Since both models have the exact same architecture you should be able to iterate one state_dict and use its key to index both dicts inside the loop.
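Here is a rough sketch of the layer-wise version. The keys of a state_dict are built from the module path plus the parameter or buffer name, so splitting at the last dot gives the name of the owning module. The helpers `layer_names` and `layerwise_combine` are hypothetical names, and the sketch assumes that a "layer" means a module that directly owns at least one parameter or buffer, so ‘n’ is the number of distinct prefixes (activation modules without parameters do not count). As a further assumption, non-floating-point buffers are copied from the first model.

```python
import copy

import torch
import torch.nn as nn


def layer_names(state_dict):
    """Ordered unique prefixes: everything before the last '.' of each key."""
    names = []
    for key in state_dict:
        prefix = key.rsplit(".", 1)[0]
        if prefix not in names:
            names.append(prefix)
    return names


def layerwise_combine(m1, m2, lambdas):
    """Layer i of the result is lambdas[i] * m1 + (1 - lambdas[i]) * m2."""
    sd1, sd2 = m1.state_dict(), m2.state_dict()
    layers = layer_names(sd1)
    if len(lambdas) != len(layers):
        raise ValueError(f"expected {len(layers)} lambdas, got {len(lambdas)}")
    lam_of = {name: float(lam) for name, lam in zip(layers, lambdas)}

    new_sd = {}
    for key, t1 in sd1.items():
        lam = lam_of[key.rsplit(".", 1)[0]]  # weight and bias share one factor
        if torch.is_floating_point(t1):
            new_sd[key] = lam * t1 + (1.0 - lam) * sd2[key]
        else:
            # Assumption: integer buffers are copied from m1, not blended.
            new_sd[key] = t1.clone()
    merged = copy.deepcopy(m1)
    merged.load_state_dict(new_sd)
    return merged


# Toy models: keys are '0.weight', '0.bias', '2.weight', '2.bias'.
m1 = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))
m2 = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))
merged = layerwise_combine(m1, m2, [0.2, 0.9])
```

It is worth printing `layer_names(m1.state_dict())` once for your own model to check that the prefixes match what you consider a layer before choosing the length of ‘lambda’.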