Combining Trained Models in PyTorch

It seems that the “soft parameter sharing” could be implemented as a regularization term.
You could follow this post, which shows a general way to add a regularization term to your loss.
In your use case you would either have to calculate the regularization terms between all models or use the “average approach”.
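A minimal sketch of the idea for two models (names like `soft_sharing_penalty` and the weight `lam` are my own, not from the post above): penalize the squared L2 distance between corresponding parameters, so the models are encouraged to stay close without sharing weights directly.

```python
import torch
import torch.nn as nn

def soft_sharing_penalty(model_a, model_b):
    # Sum of squared L2 distances between corresponding parameters.
    # Assumes both models have the same architecture, so their
    # parameters line up one-to-one when zipped.
    penalty = torch.zeros(())
    for pa, pb in zip(model_a.parameters(), model_b.parameters()):
        penalty = penalty + (pa - pb).pow(2).sum()
    return penalty

model_a = nn.Linear(4, 2)
model_b = nn.Linear(4, 2)
lam = 1e-3  # regularization strength (hypothetical value, tune for your task)

x = torch.randn(8, 4)
target = torch.randn(8, 2)
criterion = nn.MSELoss()

# Total loss = task losses + soft-sharing regularization term
loss = (criterion(model_a(x), target)
        + criterion(model_b(x), target)
        + lam * soft_sharing_penalty(model_a, model_b))
loss.backward()
```

With more than two models, you would either sum this penalty over all model pairs or regularize each model toward the average of the parameters (the “average approach”).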