Combining Trained Models in PyTorch

It seems that the “soft parameter sharing” could be implemented as a regularization term.
You could follow this post, which shows a general way to add a regularization term to your loss.
In your use case you would either have to calculate the regularization terms between all models or use the “average approach”.
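A minimal sketch of the idea for two models (names like `soft_sharing_penalty` and the weight `lam` are my own, not from the post above): penalize the squared L2 distance between corresponding parameters, so the models are encouraged to stay close without sharing weights directly.

```python
import torch
import torch.nn as nn

def soft_sharing_penalty(model_a, model_b):
    # Sum of squared L2 distances between corresponding parameters.
    # Assumes both models have the same architecture, so their
    # parameters line up one-to-one when zipped.
    penalty = torch.zeros(())
    for pa, pb in zip(model_a.parameters(), model_b.parameters()):
        penalty = penalty + (pa - pb).pow(2).sum()
    return penalty

model_a = nn.Linear(4, 2)
model_b = nn.Linear(4, 2)
lam = 1e-3  # regularization strength (hypothetical value, tune for your task)

x = torch.randn(8, 4)
target = torch.randn(8, 2)
criterion = nn.MSELoss()

# Total loss = task losses + soft-sharing regularization term
loss = (criterion(model_a(x), target)
        + criterion(model_b(x), target)
        + lam * soft_sharing_penalty(model_a, model_b))
loss.backward()
```

With more than two models, you would either sum this penalty over all model pairs or regularize each model toward the average of the parameters (the “average approach”).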