Yes, you could get the state_dicts of both models, average the parameters, and reload the new state_dict.
Here is a small dummy example:
# Setup
import torch.nn as nn

modelA = nn.Linear(1, 1)
modelB = nn.Linear(1, 1)
sdA = modelA.state_dict()
sdB = modelB.state_dict()
# Average all parameters
for key in sdA:
    sdB[key] = (sdB[key] + sdA[key]) / 2.
# Recreate model and load averaged state_dict (or use modelA/B)
model = nn.Linear(1, 1)
model.load_state_dict(sdB)
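
If you want to average more than two models, the same idea can be wrapped in a small helper. This is only a sketch: `average_state_dicts` is a hypothetical name (not a PyTorch function), it assumes all models share the same architecture (identical keys and shapes), and it copies non-floating-point entries from the first model instead of averaging them, which is an assumption you may want to change.

```python
from collections import OrderedDict

import torch
import torch.nn as nn


def average_state_dicts(state_dicts):
    """Hypothetical helper: element-wise mean of several state_dicts."""
    averaged = OrderedDict()
    for key, first in state_dicts[0].items():
        if first.is_floating_point():
            # Stack the matching tensors and take the mean over the new dim
            stacked = torch.stack([sd[key] for sd in state_dicts])
            averaged[key] = stacked.mean(dim=0)
        else:
            # Assumption: integer entries (e.g. counters) are taken from the
            # first model rather than averaged
            averaged[key] = first.clone()
    return averaged


modelA = nn.Linear(1, 1)
modelB = nn.Linear(1, 1)

# Load the averaged parameters into a fresh model
model = nn.Linear(1, 1)
model.load_state_dict(
    average_state_dicts([modelA.state_dict(), modelB.state_dict()])
)

# Sanity check: the new weight should be the mean of the two originals
expected = (modelA.weight.detach() + modelB.weight.detach()) / 2
print(torch.allclose(model.weight.detach(), expected))
```

Passing a longer list of state_dicts works the same way, as long as every model has the same parameter names and shapes.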